Design and Testing of On-Demand Distributed Content Models for Distribution of High Definition Media Over IP Networks
Mat Thomas
S75245/BAFM1107
RA/MM/FM 303: Research Project

Design and Testing of On-Demand Distributed Content Models for Distribution of High Definition Media Over IP Networks

Submitted in Partial Fulfilment of the Bachelor of Arts (Hons) Film Making
Student Name: Mathew Aron Thomas
Student Number: S75245
Course code: BAFM1107
Due Date: 16th January 2009
WORD COUNT: 13,861 (body) / 16,805 (body+ref)
Module Lecturer: Hardie Tucker

Table of Contents

Summary
Chapter 1
1.1 Introduction
1.2 Purpose Statement
1.3 Outline of the Problem
Chapter 2
2.1 Hypothesis Statement
2.2 Definition of Terms
2.3 Scope of Research
2.4 Research Questions
2.5 Literature Reviews
2.5.1 Literature Review 1
2.5.2 Literature Review 2
2.5.3 Literature Review 3
Chapter 3
3.0 Overview
3.1 Research Family
3.2 Research Approach
3.3 Data Collection Methods
3.3.1 Questionnaire
3.3.2 Experimentation
3.4 Ethical Issues
3.5 Data Collection Summary
Chapter 4
4.1 Project Schedule
4.2 Resources
4.2.1 Hardware
4.2.2 Software
4.2.3 Selection Algorithm
4.2.4 Financial
4.2.5 Human
Chapter 5
5.1 Overview
5.2 Results
5.2.1 Gender of Respondents
5.2.2 Location of Respondents
5.2.3 Amount of movies watched (Hours per week)
5.2.4 Viewing Method
5.2.5 Amount willing to pay
5.2.6 Own an HDTV
5.2.7 Own HDTV Equipment
5.2.8 Amount of HD Watched (per week)
5.2.9 Amount Willing to Pay if no HD Viewed
5.3 Project Activity
5.3.1 Pre-Buffer Data-Flow
5.3.2 Results of the Tests
5.3.3 Summary of Results
5.3.4 Final Test
5.4 Problems/Limitations of the system
5.4.1 Power Usage
5.4.2 Internet Connection Usage
5.4.3 Fast-forwarding
5.4.4 Licensing
5.5 Potential Benefits of the System
5.5.1 Advertising
5.5.2 Content Rights
5.5.3 Indie Producers
5.5.4 Expandability
5.5.5 Data Amounts
Chapter 6
6.1 Conclusions
6.2 Recommendations
Chapter 7
7.1 Endnotes
7.2 References
7.3 Appendices
7.3.1 Scan from "HDTV for Dummies" page 86
7.3.2 SD/HD Resolution Comparison Table
7.3.3 Course Flowchart
7.3.4 Example Questionnaire
7.3.5 Project Schedule
7.3.6 Hardware Testing
7.3.7.1 Server - Installing Windows XP
7.3.7.2 Server - Network Speed
7.3.7.3 Server - Network Analyser
7.3.7.4 Server Configuration
7.3.7.5 Server - Splitting of Media File
7.3.8.1 Client - Network Analyser
7.3.8.2 Client - Simulating ADSL Bandwidth
7.3.9.1 Viewer - Connection to Server
7.3.9.2 Viewer - Method
7.3.10 Viewing Bias Code
7.3.11.1 Original Virtual Data Flow Diagram
7.3.11.2 Refined Virtual Data Flow Diagram
7.3.12.1 Questionnaire Results
7.3.12.2 Questionnaire Analysis
7.3.13 Data Flow Decisions
7.3.14 Initial Testing Results
7.3.15 Penultimate Test Results
7.3.16 Final Test Results
7.3.17 Spec Sheet of Proposed Hardware (from module 301: Business Plan)
7.3.18 Specialised Application (from module 302: Specialised Application)

Summary

In my opinion, high-definition media within the industry is not growing as fast as it could. A possible reason for this is that people's expectations of content have changed over the years. I firmly believe that people want to watch what they want when they want, the opposite of the current TV broadcast model.
A problem associated with content of all kinds, not just high definition, is that of distribution. Physical distribution was originally aided by the automobile and improving transport links, but viewers are now so numerous and so widely dispersed that this solution is no longer viable. It has been partially solved in cinemas by using satellites and hard-drives containing the film's data. In my opinion, home entertainment has been "left behind" in this field, especially when dealing with large amounts of data, such as high-definition audio and video. My theory is that if people had a choice of high-definition films in their home to view when they wanted, they would. My research aims to develop a system capable of using current technologies to allow the above to happen. It will work by exploiting the "down-time" of the person (while they are at work/asleep) to its advantage. Based on this and further research into the technological methods I could potentially use, a system was developed containing a total of 12 clients (viewers) and 1 server. The evaluation of the project rested on one simple criterion: to be able to flawlessly play back HD content without any physical media changing hands, using current technologies. I would suggest that any future researchers attempt to create a wider test-bed in order to gain more useful feedback.

Chapter 1

1.1 Introduction

The problem of distributing media to an audience is as old as media itself. What use is media without a method of delivering it to an audience? Traditional means of delivery use physical media such as vinyl records, audio tape and CDs. If we look at the film counterparts to these we can also include VHS, Laserdisc, VCD, DVD and more recently HD-DVD and Blu-Ray. Alternative non-physical methods of delivery include AM/FM radio and broadcast (UHF/VHF) TV.

The non-physical methods of delivering media, more specifically "high definition" films and TV shows, are what I am concerned with. To take this a step further and narrow my scope of research, I will be looking at IP-based distribution methods. This broadly means "the internet"; however, it is important to identify the difference between developing for the internet and developing for an IP-based system. If we develop for the internet then we are constricted by certain restrictions of the internet and consequently have to develop to those standards. If we develop for IP-based systems we can design the system to work with a multitude of standards, with the potential for operating the system over the internet as well.

Based on previous work with networking, I will not attempt to run any tests on a connection which has less downstream bandwidth than 512kbps, or half a megabit. This is because although many ISPs will class it as "broadband", 512kbps is not a sufficient amount of bandwidth to provide accurate results, as the overheads involved with IP networking limit the connection to around 460kbps. Broadband is a relative term, and although many people understand a 512kbps connection to be broadband, the FCC (USA) regards anything over 768kbps as "broadband".1
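As a rough check of that figure, applying the ~9.43% IP overhead cited in section 5.3.1:

    [512kbps x (1 - 0.0943) ≈ 464kbps]

which is in line with the roughly 460kbps of usable throughput quoted above.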
1.2 Purpose Statement

My project and associated research have led me towards my intended destination within the industry. I am interested in the field of media distribution, and more specifically in using computer-based systems to improve the speed, quality and reliability of distribution. My project is to research and ultimately develop a system which allows for on-demand viewing of high-definition content. The benefit of researching this topic is that it is directly related to my project. The industry is changing rapidly, and to create a system which utilises the full potential of current technologies it is important to know what the current systems are, what they can do and, most importantly, what they cannot.

1.3 Outline of the Problem

In my own opinion, the way in which audiences view films and TV shows is constantly changing. To capitalise on both new and existing audiences, distributors need to be at the forefront of technology. In the past it seems that movie studios have generally resisted new media formats and distribution methods.2 According to the article, there are three main reasons why studios generally resist adopting new distribution methods. The primary reason given is the issue of piracy; studios are extremely concerned with protecting their investments. A secondary reason is resistance from the general public, either on grounds of cost or lack of compatibility with their existing equipment. The final reason is that if customers can watch high-quality movies in their home, what would happen to the business of cinema?

A number of formats have come and gone over the years, with varying success, and I will attempt to present in greater detail the reasons for their success or failure. This will involve research from the perspectives of both producers and consumers. "Format wars", or situations where two or more comparable methods of distribution have entered the marketplace at roughly the same time, are not a new phenomenon. An early example of this can be seen in the late 1880s with regard to the distribution of electricity, and whether to use AC (alternating current) or DC (direct current).3

From my own observations of the distribution of content, it appears that in the majority of countries, excluding the USA and Japan, there is currently no legal model for viewing HD content without physical media (such as Blu-Ray/HD-DVD). The problem with this is that younger generations want to view media how they want, when they want and where they want. From my own research of asking people in my age group (18-25), current solutions fall well below what current technology is able to deliver. Furthermore, it seems that much of this sample is switching to illegal methods of obtaining media for the simple reason that there are few alternatives which suit their desires.4

Chapter 2

2.1 Hypothesis Statement

Design and Testing of On-Demand Distributed Content Models for Distribution of High Definition Media Over IP Networks.

2.2 Definition of Terms

Test: a scientific approach will be used, based on my previous learning. The main technical approach to testing throughout this project is that of "white box testing".5 I am able to use this method of testing as I wrote all of the code which I will be testing. This method is by far the most efficient, as the tester has the ability to change nearly every variable; however, it is not commonly used as it requires the tester to have full knowledge of the system being tested.
On-Demand: a viewer "pulls", or requests, media from a content server.
Model: "a representation, generally in miniature, to show the construction of something".6
High-Definition: "at least 720 horizontal lines of resolution with progressive scan".7
IP: "the main network layer (3) for the TCP/IP protocol suite",8 or one of the most common network protocols, as used within the internet, but not exclusively.
Network: "a system of interconnected electronic components".9

2.3 Scope of Research

My research towards this project is primarily focussed on experimental testing of various distribution methods, and their related performances, advantages and practicalities. I will have to decide initially which methods I will test, as there are potentially thousands to choose from. It is important to note that I will be testing the technical and practical aspects of the systems, with little focus on the aesthetics of each method of distribution. I will not be concerned with methods of limiting piracy; I will, however, allow for the system to be compatible with common types of rights management.

I have identified the following topics as being out of the scope of this research. The content which I will be transferring is essentially irrelevant and will not be judged. Another aspect of the research/project which I will not be concerned with is the usability of the software and the set-up of the systems. Although both are required for me to complete the project/research, I am focussing on the transmission methods. While the interface is an obvious component of media distribution, I do not have the time to design a GUI; furthermore, a GUI for navigating on a TV could be a research project in itself.

2.4 Research Questions

Some questions which arise regarding my research are listed below:
1. What current alternative methods of delivery are available?
2. Which codecs should be used?
3. What is the current state of worldwide IP communications?
4. What devices will the media be playing on?
5. What do audiences want to watch?
6. What HD standard do audiences want?
7. How much would people use a service like this?
8. Why is there not a current solution to the problem identified above?
9. Has there been research in this specific field already?
10. Who would be interested in this research / project?

Some less relevant questions still have a bearing on the project but are not critical to its success: What legal problems are involved with this method of distribution? How can we prevent / limit piracy?

Many of these questions lead to further questions with little potential for a definite answer. This is not necessarily a problem, just something to be aware of when presenting the findings later on in this document.

2.5 Literature Reviews

2.5.1 Literature Review 1

HDTV for Dummies, Briere, Danny & Hurley, Pat, Wiley Publishing, 2007

The authors of this book both have a great deal of credibility within the media industry, more specifically on the technology side. Danny Briere is "one of the original modern-day telecom strategists, having advised carriers, vendors, regulators, start-ups, and governments for more than 20 years".10 The source also goes on to say that "Briere is recognised as one of the most creative and innovative telecom strategists in the industry today."11 The source for the aforementioned quotes is a company called "TeleChoice", who define themselves as "the leading strategic catalyst for the telecom industry".12
This source indicates that the author is the CEO of the company, so we have to consider it potentially heavily biased towards painting the author in a positive light. The second author, Pat Hurley, appears to have a very detailed knowledge of logic, which forms the basis of all computing technologies.13 He has also written or co-written many other books in the "Dummies" series, focussing on media and its associated technologies.

The authors make no exaggeration as to the complexity of the content within the book. It is designed for beginners to understand the terms and related technologies of HDTV. The reason I believe this book will prove useful to me is that quick referencing of terms I don't understand is completely necessary within this contemporary, still-evolving technology. After reading the book further, I realised that it wasn't lacking in detail as much as I had thought. In certain areas the authors describe aspects of this technology in immense detail, for example "Table 6.1"14 (Appendix 7.3.1), which explains the bandwidths, modulation types and bit rates of some popular distribution mediums. Furthermore, detailed information like this is always externally referenced. This allows me to follow the reference for possibly even more specific information from the same original source. The book also includes some topics I will most definitely not be requiring, namely "Mounting your HDTV" and "Places to buy an HDTV". These sections are designed for consumers, and although they both heavily reference external sources, they are of no interest to my field of research.

2.5.2 Literature Review 2

(The Technology of) Video & Audio Streaming, Austerberry, David, Focal Press, 2005

This book has been praised by the European Broadcasting Union, who said: "for broadcasters... involved in media delivery across the web, this book could be a very useful first step in understanding the basics of streaming technologies". The author has worked in the industry of media communications for over 30 years; for 10 of those he was a project engineer for the BBC (UK).15

The content of the book is highly detailed, focussing on the more technical aspects of media streaming. It covers everything from the history of print and media development, aspects of convergence within the industry and IP networks, right up to encoding methods, webcasting and DRM (digital rights management). The book provides excellent referencing, along with a clear and concise glossary and abbreviations section. This part of the book is useful for quick look-ups of terms, while the previous 300 pages are designed for advanced technical reading. I believe there would be many literary prerequisites for this content; however, from reading sections I have ascertained that I have enough knowledge in this field for the book to be useful for my research. Each section has its own summary, complete with further reading sources.

The book has sections which are not as relevant to my research as others. These include sections detailing content creation methods for particular applications, such as webcasting and audio. Although these sections will not be called upon, there is still useful information held within them, for example the frequency responses of average consumer sound systems. Information such as this could become useful when testing in the future.
Sections I am particularly interested in, which will form much of my research, include encoding techniques, "multicasting" to multiple devices using multiple encoding methods and, finally, rights management. All the above sections are of interest to my research, since they make up around 70% of my project. The only disadvantage with this book is that it was last revised in 2005, and it is now 2008. This means that potentially 3 years' worth of technical advances will not be included. Following Moore's Law,16 3 years corresponds roughly to a doubling or more of the levels of technology within the industry. Because of this I will need to try to find a newer revision of the book, or consult external sources for further, more accurate and contemporary information.

2.5.3 Literature Review 3

Film and Television Distribution and the Internet, Sparrow, Andrew, Gower Publishing, May 2007

This book covers primarily the laws, regulations and current trends concerning internet distribution of film and TV media. The way in which the author presents this information is very clear and simple. He has decided to group together associated laws and regulations, present them in their entirety and then break them down into their key points. This approach to complicated law is welcome, as it allows any individual interested in this field to understand perhaps the largest obstacle to the media industry and distribution.

The author himself is a solicitor in the UK and his primary interest is in media distribution law. He was acknowledged in 2004 as "one of the 100 individuals in the UK who have contributed most to the development of the internet in the last 10 years" by the DTI.17 The author provides plenty of further reading in this field; however, the majority of it is written by the author himself, which could in turn present a biased view of the various laws and regulations. There is, however, a large selection of related material written by other authors.

Law is something about which I have little knowledge, and after reading sections of the book it becomes apparent that it is an extremely important component of the industry. To research various distribution methods and be able to work effectively within this field, a knowledge of any related potential hindrances, such as the Data Protection Act 1998,18 is crucial. I believe this book will be one of the most helpful towards my research and project, as law is possibly the most important aspect of the industry with regards to new distribution technologies.

Chapter 3

3.0 Overview

One of the primary reasons for undertaking this research is that I personally feel that the distribution models employed by the media industry are lagging behind what is technically possible. I feel that the music industry realised this extremely late, only after services such as Napster19 became mainstream. It is only recently that major record labels have adopted the internet distribution model. This was hard-hitting for the industry, as the majors had largely missed an opportunity to increase publicity, revenue streams and audience in one go. I believe that they were more concerned with trying to protect their (then) current models, which were quickly becoming outdated. The only reason, in my opinion, that this has not really happened to the film industry yet is that the technical conditions are only just now becoming aligned with viewers' expectations.
Put simply, a 128kbps MP3 audio file is around 3MB in size, which is no more than 5 minutes' downloading time on even the slowest of internet connections. A feature-length movie, on the other hand, even when heavily compressed, is around 700MB. This in turn means potentially having to wait around 18 hours for the download to complete. People are generally quite impatient, and if the process is going to take as long as 18 hours for one movie then they will probably seek alternatives. This, in my opinion, is what allows the entire industry to remain complacent.
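To put figures on this comparison (the exact times depend on the effective connection speed assumed):

    [download time = (file size in MB x 8) / (speed in Mbit/s) seconds]
    [3MB x 8 / 0.512 ≈ 47 seconds; 700MB x 8 / 0.512 ≈ 10,938 seconds ≈ 3 hours]

The 18-hour figure quoted above therefore implies an effective throughput of around 86kbps (700MB x 8 / 64,800 seconds), i.e. a dial-up-class connection; on faster lines the wait shrinks, but remains hours rather than seconds.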
Based on the above opinions, it is clear that it is simply a matter of time before the technology "catches up", download times are severely reduced and, in turn, people's propensity to obtain movies by whatever method increases. I also believe that there is another factor which helps shift the "power" back to studios. The idea of "HD" content and the benefits associated with it are major factors in customers' purchasing habits. HD also means that the amount of data required is increased significantly, to at least 4 times that of "SD"20 (see appendix 7.3.2). This obviously means that once again the movie industry is in a strategic position which the music industry can only dream of. Again, based on my own observation, the movie industry isn't capitalising on this potential market, for many reasons. The internet and IP systems are still essentially in their infancy, and issues such as piracy, guaranteed service levels, security and licences instantly arise as soon as you begin to discuss this very issue with industry professionals. I believe, however, that these issues can be resolved, and it seems that more and more trust is being put into this field of the industry, but not at a quick enough pace. I intend to focus my research on technical constraints and quality of service.

A brief description of my background may help to explain my reasoning for undertaking this project. I have always been interested in computers and, more specifically, data communications. I previously attempted to undertake this very same research in the UK, at UMIST, while I was studying for a BSc in Computing Science. I felt that the course had modelled its educational rationale on that of a "factory line". It seemed that people were being taken in and, as long as they completed the expected work, they would pass. I expected that creativity would be put behind productivity, as it was a BSc course at an antiquated UK institution; however, I was shocked as to how much. I proposed this very same research/project as my 3rd year project and was surprised to hear that no lecturer wanted to assist me, a requirement of the project. When I consulted my personal tutor, very little assistance was given other than "choose something simpler".

The original reasoning for the main hypothesis of this paper was based on both personal experience and observation of others. I noticed that people within my age group (younger people even more so) would prefer to watch what they wanted and when they wanted. It seemed to me that the old content distribution model of a TV broadcast system, where people watch what the broadcast company wants them to, was being rejected by younger generations. As soon as the technology and community are in place we can see real-life examples of this; one of the most successful is YouTube. YouTube is often seen as little or no threat to TV; however, I believe that the very fact that people are accessing content in a different manner to traditional methods is enough reason to "listen" to the audience. Even with the (when compared to TV) poor content and poor quality available on YouTube, people still want to view what is there. I believe this is partly due to a perceived choice of what to watch (it is possible to obtain a higher ranking in YouTube search results by paying a fee, therefore lessening the real choice).

One big downside from my experience of YouTube and similar "video streaming" websites is a phenomenon called "buffering". A buffer is supposed to allow for smooth, uninterrupted playback no matter what connection you are using. This is wonderful in theory; however, in practice it doesn't always work properly. This is something which I want to build into my project, or as I will refer to it from now on, "quality of service". When I watch a movie, I want to watch it entirely uninterrupted, otherwise it detracts from the "flow" of the movie. With all the above in mind, I wanted to look into the possibility of somehow merging the community idea of YouTube with movies, while keeping up with current technologies and providing flawless playback to appeal to the widest audience possible. A flowchart outlining this can be found in section 7.3.3.

3.1 Research Family

My research primarily contains quantitative data, as the project is an experiment and its results will be in numerical format. There is, however, a limited amount of qualitative data to process. This is because I have to analyse human responses to open-ended questions about viewing habits: when, what, why and how. I am also able to make qualitative statements derived from this data. The data is split around 80% quantitative and 20% qualitative. I will be using a combination of experimentation, a targeted questionnaire and a limited amount of unstructured feedback. The primary method of data collection before I began the project was a questionnaire; the results from the testing phase of the project were based on an experimentation approach; and finally feedback was sought from industry professionals, although much of it was unstructured.

3.2 Research Approach

My research uses one main type of approach, the "experimentation" method. I have attempted to investigate various methods of distribution by testing their performance using an array of computers simulating demand. "Experimentation" is defined as "the process of testing a hypothesis by collecting data under controlled, repeatable conditions".21 To produce credible results from an experiment I needed to be able to control my testing and be able to repeat it. I achieved this by running 4 computers continuously: one being a server for media (simulating the broadcast company), and the other three each running four "virtual machines",22 giving a total of 12 clients (simulating the viewers). The systems did not display the content; they simply simulated the speed of transfers which could be achieved over IP networks. The reason for using virtual machines is twofold: it reduces the total expenditure of the project by limiting the number of physical machines, whilst allowing for relatively easy control of test conditions.
By creating one "image" of a system, it is extremely easy to replicate it on multiple virtual machines running under different conditions, whilst still retaining the individuality of each "client". If the research were to be conducted without thought for cost, then I could rent 13 additional ADSL internet connections and connect each system to the internet individually. This, however, would cost somewhere around 26,000฿ (£380/$790) per month. My solution to this problem was therefore to run the clients on a LAN at 1000Mbit/s, with the data rates of each client "capped", or limited, to simulate real-life conditions, depending on the test being run.
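In the tests themselves this capping was done with Net Peeker; purely to illustrate the principle, a cap of this kind can be sketched as a token bucket, where credit accumulates at the simulated line rate and each outgoing chunk of data spends credit equal to its size (hypothetical Python, not part of the actual test harness):

    import time

    class TokenBucket:
        # Caps throughput at rate_bps bits/second, allowing bursts up to burst_bits.
        def __init__(self, rate_bps, burst_bits):
            self.rate = rate_bps
            self.capacity = burst_bits
            self.tokens = burst_bits
            self.last = time.monotonic()

        def wait_to_send(self, chunk_bytes):
            # Block until enough tokens exist to send one chunk of chunk_bytes.
            needed = chunk_bytes * 8
            while True:
                now = time.monotonic()
                self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
                self.last = now
                if self.tokens >= needed:
                    self.tokens -= needed
                    return
                time.sleep((needed - self.tokens) / self.rate)

    # Example: make a client on the 1000Mbit/s LAN behave like a 2Mbit/s ADSL line
    cap = TokenBucket(rate_bps=2_000_000, burst_bits=2_000_000)
    # call cap.wait_to_send(len(chunk)) before each network write

With each virtual client applying its own bucket, a single 1000Mbit/s LAN can stand in for twelve independent ADSL lines.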
I ran each test for one week to simulate a typical "week's viewing" from a client. I aimed to run 8 different tests, modifying the following variables and narrowing their parameters later based on further research: bandwidth available, containers/codecs used, distribution methods (server-client, peer-to-peer, etc.) and resolutions of audio/video. This gives a total of 2 months for the intended tests to complete; however, the actual time taken depended on the time needed to set up the systems and any problems I ran into.

3.3 Data Collection Methods

After consulting various sources regarding data collection methods, I decided to attempt to triangulate data sources. According to the University of Bolton,23 triangulation is "a way of assuring the validity of research results through the use of a variety of research methods and approaches. It is a means of overcoming the weaknesses and biases which can arise from the use of only one of the methods we have described, such as observation, questionnaires etc".24 If I were to attempt to triangulate this statement in itself, I would draw on my personal learning from class along with informal interviews with other researchers. In essence, the more data sources you can acquire which either corroborate or contradict the other data sources, the better. Based on the above, I decided to collect data from the following sources (and their types), and further on I will attempt to triangulate specific data:

• Questionnaire - Primary data
• Experimentation - Primary data
• Industry Professionals - Primary/Secondary data
• Document Reviews - Secondary data
• Previous work - Combination of data types

3.3.1 Questionnaire

To obtain information regarding viewing habits, I looked for studies already completed, as the sample population available to me would have been far too small for the scope of this project. These were accessed from government organisations in various countries.25 26 To further enhance the credibility of this data, I sampled a population of TV viewers and provided them with a simple questionnaire regarding their current viewing habits. I attempted to mirror the questions already answered by the external sources, to make the triangulation of data reliable. The questions and potential answers I asked are as follows:

• Age
• Gender
• Location
• How often do you watch movies?
  • Less than 1 hour per week / between 1 hour and 5 hours per week, etc.
• What method of viewing do you use mostly, and why?
  • Cinema, TV, PC, Mobile Phone, PMP
• Do you own an HDTV (capable of displaying 720p material)?
• Do you own any equipment which allows you to watch HD material?
  • Satellite, Blu-Ray/HD-DVD, Media Centre, HD-DVR (HDD-based), PlayStation 3™
• How often do you watch HD material?
  • Less than 1 hour per week / between 1 hour and 5 hours per week, etc.
• How much would you be willing to pay (per month) for unlimited HD content on demand?
  • 0฿-500฿, 501฿-1000฿, 1001฿-2000฿, etc.

My reasoning for producing a questionnaire such as the one outlined above came mainly from guidance published by Thames Valley University.27 They explain the key points to producing a successful questionnaire: keep it short, use unambiguous questions and make sure each question addresses only a single issue, to name but a few.

I initially posted this onto a survey website with a low level of success. I received a massive number of responses (over 300 submissions); however, the results weren't particularly useful. It became quite clear to me that the people who took the survey already had a keen interest in my field of interest, HD film, and thus the results were atypical. I believe the reason for this was that I posted the survey onto a popular forum for discussing everything video-related, doom9.org.28 The results from the initial survey were extremely biased towards "power-viewers": people whose hobby is digital media. In turn it was very rare to see answers from people who didn't have much interest outside of "power user" viewing habits. Another point of note is that, as the surveys were effectively anonymous, I don't believe that many of the responses given were entirely accurate. From my own observation it seems that many people on forums such as these (which otherwise provide invaluable information) tend to try to out-do one another by way of exaggeration. For example, for the question "Do you own any equipment which allows you to watch HD material?" the responses were nearly all "yes" (~98%). This, in my opinion, is down to a combination of both bravado and my choice of sample population.

Considering the above issues, I decided to drastically change my method of both sampling and questioning. I prepared an open-ended questionnaire with the same questions (minus the age/sex/location questions) and sent it to people whom I actually know, friends and family. I also put a statement at the beginning of the email explaining the point of the questionnaire and encouraged people not to lie or exaggerate. My basis for doing this was that I already know roughly about these people's viewing habits, as they are friends/family, and that a more personal questionnaire would yield more accurate results (the original email is available in appendix 7.3.4).

The questionnaire was emailed to 25 people, from whom I received 21 responses. I was disappointed at first in both the number of available people and the response rate, after having had over 300 replies initially; however, as soon as the data was entered into a spreadsheet it became apparent that quality of data is far more important than quantity. In retrospect I think that I could have obtained a balance between quantity and quality by posting the questionnaire to my Facebook page by way of a mass message; however, I believe that the results I obtained are sufficient for my research, as they match my expectations.

3.3.2 Experimentation

I left all systems running 24 hours per day, for one week at a time. I varied the usage time and content accessed on each system (to simulate different customers).
The systems each logged their individual statistics; these include:

• Average data rates achieved
• Time taken between the request of the media and the delivery of the media
• Whether viewing was interrupted, and why
• Total bandwidth used

The results of these experiments are presented to me in their raw form: simply numbers for the above test factors. The benefit of using logging software for this research is that I can choose specifically what I want to analyse after the tests have completed. For example, the software allows me to log everything related to the tests, and I can choose date ranges and types of data to extract and analyse. Data for the above tests is numerical, presented in computing-specific formats: KB/s (kilobytes per second) for data rates, multiples of bytes for total bandwidth used, and so forth. For more technical data, such as MTU (maximum transmission unit) size, I required further research into ways to interpret the data.

To attach credibility to my results I ran multiple logging systems on each machine and compared the results. My micro-hypothesis is that "every result from every machine should corroborate that of each other machine, allowing for a small margin of discrepancy". I achieved this by running the following logging systems:

• 1 system-wide logging application running on:
  • the server machine (total of 1)
  • the client machines (total of 3)
  • the virtual machines (total of 12)
• 1 application-specific logging tool (the application varied with each test) running on:
  • the server machine (total of 1)
  • the virtual machines (total of 12)
• 1 overall logging utility on the network switching hub

The above systems provide one primary data collector per "virtual client", one per "client machine", two from the server and one more from the switching hub. This in effect gives us 5 data sources: one which acts as a base, and 4 more from 3 different sources to corroborate it.
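Net Peeker and the per-application logs did this job in practice; as a rough illustration of what the system-wide collector records, a sampling loop of this kind (hypothetical Python, using the psutil library) writes a timestamped data-rate log that can later be filtered by date range:

    import csv, time
    import psutil  # assumed available; exposes OS-level network counters

    def log_throughput(path, interval_s=5, samples=120):
        # Append rows of (timestamp, KB/s received, KB/s sent) to a CSV log.
        last = psutil.net_io_counters()
        with open(path, "a", newline="") as f:
            writer = csv.writer(f)
            for _ in range(samples):
                time.sleep(interval_s)
                now = psutil.net_io_counters()
                writer.writerow([
                    time.strftime("%Y-%m-%d %H:%M:%S"),
                    (now.bytes_recv - last.bytes_recv) / 1024 / interval_s,
                    (now.bytes_sent - last.bytes_sent) / 1024 / interval_s,
                ])
                f.flush()
                last = now

    log_throughput("vclient01_rates.csv")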
These results will be used to compare the different methods applied, in order to deduce a final system to test further, which will form the basis of my project. I need to be able to verify the results, which is why I observed the various systems at set intervals and logged the results manually. The problem with this method of data collection is that if a system fails for any reason and there is no alert to this, potentially an entire week of research will be wasted. This is why I included fail-safes in each system: if a problem is detected, it is noted and the system is automatically restarted to minimise downtime. However, this potentially random downtime can add substance to the research if it is managed effectively. This is because in a real-life situation external factors will affect the results of the system. For example, if there were a power cut to one of the systems, it would affect the performance of other systems on the network. As long as these outages are logged and included in the data breakdowns, the results will accurately represent the tests performed. Considering the aforementioned data collection methods, my research will primarily be "desk" based, with very little "field" research.

3.4 Ethical Issues

For my research there are very few ethical issues; however, I did have to obtain and transfer large amounts of media data from different sources. As the media itself isn't being critiqued in this research, there is no reason why it can't be a compilation of usable media from various open-source origins, such as "Mariposa HD",29 which is distributed under a Creative Commons licence. This particular source allows me to carry out all of my test conditions within the law; however, for the final tests I wished to use the movie "Pulp Fiction",30 as I would have to watch it about 4 times to analyse the "flawlessness" of the playback.

Another ethical issue is that of piracy: my system must be able to handle implementations of DRM (digital rights management) for it to remain a viable transmission method. Above, I mentioned that I would not be including any form of anti-piracy system, as it is outside the scope of the research. To clarify: the system can include DRM, but for the purposes of the following tests none was included.

Due to the way the algorithm (which selects potential films for the client) works, it becomes apparent that people might have concerns about their privacy: what they are watching, and the fact that the system knows this. As with all interconnected systems which transfer any kind of personal data, not only is it crucial that the data is sent in a secure manner, but it is also wise to design the system so that as little information as possible is sent to the server. For example, instead of sending the viewer's information to the server for processing, we can instead send the entire list of movies available to the client and process everything locally.

Probably the most important ethical issue for this project is that of energy saving. As the technique I am using for distribution uses the electricity and internet connection of the client while they are away (at work, sleeping, etc.), many people might want to switch off the device to save energy. This creates a problem, as it is that specific time which allows the system to work effectively.

One final issue which I believe should be included in this section concerns my final test subject. Once all of the initial tests had been completed and a working final model was produced, I decided to test it in a more "real-life" situation, as previously mentioned. I used an HD version of the film "Pulp Fiction" and attempted to watch it from start to finish without any problems. The ethical issue arises because this particular film is banned in Singapore, and although no practical demonstration of the model is required, this would have limited my options had I chosen to give one.

3.5 Data Collection Summary

Question - Collection Method
1. What current alternative methods of delivery are available? - Document Review
2. Which codecs should be used? - Spec App
3. What is the current state of worldwide IP communications? - Industry Professional
4. What devices will the media be playing on? - Questionnaire
5. What do audiences want to watch? - Questionnaire
6. What HD standard do audiences want? - Document Review
7. How much would people use a service like this? - Questionnaire
8. Why is there not a current solution to the problem identified above? - Industry Professional
9. Has there been research in this specific field already? - Document Review
10. Who would be interested in this research / project? - Combination of sources

Chapter 4

4.1 Project Schedule

I had a project schedule planned before I started the project, as outlined in section 7.3.5.
This schedule varied a lot, mainly for reasons of software mismatching, which are detailed in the following section. As of writing I am around 3 weeks behind schedule; however, I had allowed for a December presentation, which in reality does not occur until January, so although I am behind on my own schedule, I am on track with the course schedule. Quite a few problems were encountered, primarily with the software and the accuracy of reporting test results. These problems were quickly identified and rectified before they affected the schedule too much.

4.2 Resources

This section details the resources needed for me to complete my project in its entirety. Due to the nature of the project, the majority of resources are weighted towards the technical rather than the non-technical. The divisions are outlined below:

4.2.1 Hardware

The research required a large amount of computer hardware. I already have personal access to this hardware and will be running four of the following systems, all of which are in my possession and free to use as I wish. A photo of the hardware being set up is available in section 7.3.6.

• Intel Pentium 4, 2.4GHz processor31
• 1024MB DDR2 RAM
• 120GB SATA 7200RPM HDD
• 1000Mbit Ethernet
• Systems linked using an SMC 8-port 1000Mbit switch32

These systems are perfectly suited to my application, as they allow me to test what I plan to without the need to buy any additional equipment. The client machines were each divided into four virtual machines, all using an equal share of the computer's resources. This gives each virtual machine the following specification:

• 600MHz CPU usage
• 256MB RAM usage
• 30GB storage quota
• 250Mbit network access

The above shared resources are variables which I can alter depending on my needs after the initial test run. They can also be altered dynamically whilst the tests are running; however, I would like to limit the number of variables changed, because doing so would make analysis of the tests significantly more difficult. The exception is storage space, because depending on the test situation certain clients may require more or less storage, depending on the extent of the test.

4.2.2 Software

The software I planned to use for both the research and the project included the following:

• Virtual Machine - VMWare Workstation33
• OS - Microsoft Windows XP34
• FTP Client - FileZilla35
• FTP Server - FileZilla36
• BitTorrent Tracker - BitComet Tracker37
• BitTorrent Client - Azureus38
• Network Speed Limiter/Monitor - Net Peeker39

The "base" OS is a reduced version of Windows XP, using VMWare to run 4 virtual machines of Windows XP. Only the functions needed for my research are left in this special version of XP; for example, there is no need to install sound drivers, printer drivers, advanced graphics applications, etc. This helps reduce processor load and the storage space used.

My two main proposed methods of transfer use the FTP and BitTorrent protocols. Their respective clients/servers are listed above, and they provide the best mass usage as they have both been tested under heavy load.40 This will ensure maximum uptime from the software and therefore accurate results. All the above software contains logging functions; however, to improve accuracy I will also use external software to monitor as much as possible without interfering with performance. I also need to limit the capacity/bandwidth of each virtual machine, and the software which allows me to do this is Net Peeker.
Net Peeker is also an ideal solution because it can be run on the server and provide statistics on packet transfers to back up the individual statistics on each virtual machine. The initial test run allowed me to check the software configuration, whilst using user groups for information on the best performance settings.41 42

After the initial set-up and testing of systems, it was clear that the overheads incurred by the above set-up were too high for consistent data transfers. This was primarily due to limitations imposed by Windows XP, VMWare and my method of viewing the remote machines, VNC. Windows XP will not allow more than 3 virtual machines to be run at once, VMWare is very inefficient in its resource allocation, and VNC uses a sizeable portion of the network bandwidth, so when I connect to the clients I am actually impeding the flow of data (appendix 7.3.8.1). This was not a problem for the server, as there was ample bandwidth (1000Mbit) available (appendix 7.3.7.2). I decided to change the set-up drastically before too many tests were run, in an effort to control the tests with greater accuracy. The solution I decided upon was the following:

Server:
• Windows XP
• Microsoft IIS (FTP + HTTP server)
• BitComet Tracker (torrent tracker)
• uTorrent (torrent server)
• VNC (remote viewing)
• Net Peeker (network speed limiter/monitor)

Clients:
• Windows Server 2003
  • Allows for more than 3 RDP connections
  • 4x RDP connections from a remote host
  • RDP is controlled at the host computer level, not the virtual client level, so it does not affect bandwidth on the virtual clients
• Azureus (torrent client)
  • Azureus allows for binding of IPs on each virtual client, as there are multiple virtual interfaces due to the RDP method used
• Net Peeker (network speed limiter/monitor)

A diagram (section 7.3.11.1) showing the original flow of data helps in understanding the complexities of the first set-up. As can be seen, it produces many connections from each virtual client (VClient). This in turn restricts the bandwidth available to each virtual client, and affected the tests dramatically in terms of both stability and performance. With the refined set-up (section 7.3.11.2) it is clear that the number of connections is drastically reduced and, more importantly, that they originate from the physical client machines rather than the virtual clients. As each physical machine is connected at 1000Mbit with 4 virtual machines each capped at up to 8Mbit (depending on the test criteria), it is now possible to allocate a large portion of bandwidth to "services" and administration without affecting the test.

With such a number of virtual machines I needed a system of organisation. On the "viewing client" (my personal computer) I used a free application called CoRD to RDP into the clients. Once set up, I also utilised Mac OS X's "Spaces" feature, which allows you to have multiple desktops to better organise your applications. I opened a connection to each client in a different "space", which resulted in the set-up detailed in section 7.3.9.2.

As for the server part of the system, I decided to use Windows XP as opposed to a server-based OS because the tasks it was performing were fairly basic. Windows Server usually comes prepared with masses of services (active directory, domain services, printer services, etc.) which would simply not be used.
Instead of installing a "server" operating system and then reducing it down to my needs, it was simpler to run XP and build up to what I needed.

4.2.3 Selection Algorithm

As previously stated, I wanted the system to buffer as many movies as possible, but as this could be a fairly random process, I designed an algorithm to select movies based on the user's viewing habits. For the testing I didn't actually use the algorithm, because I was concentrating on the more technical side of the project; however, I feel it is an important asset to the project as a whole. The algorithm is located in appendix section 7.3.10, written in pseudo-code,43 but I will attempt to give a basic outline of what it does here.

The algorithm:
• creates and clears variables for first use
• opens connections to the local and remote databases
  • the local DB contains a list of movies on the client and what the user has viewed
  • the remote DB contains a list of all movies available on the server
• checks for matches based on the following criteria and their weightings
  • (weightings are based on my personal habits and can be changed based on further research)
  • genre(s) - 40% weighting
  • actor(s) - 20% weighting
  • writer(s) - 10% weighting
  • director(s) - 10% weighting
  • producer(s) - 10% weighting
  • sound designer(s) - 5% weighting
  • director(s) of photography - 5% weighting
• produces a list of movies to download based on the above, sorted by ranking
• closes the DB connections

After this, the system can check the list of movies and prioritise the downloads. The intended effect is that users will have a choice, but where the number of movies available to buffer is low (on low-bandwidth connections) the system can attempt to buffer movies which the user might find enjoyable. Further research is needed to establish what factors drive people's choice of movies, given their previous viewing. It is possible that some people, for instance, simply choose which movie to watch based on its name. The point of this algorithm is that it can be adapted after more research has been completed.
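The full pseudo-code is in appendix 7.3.10; as an illustration only, a minimal Python rendering of the same weighting logic might look like this (the metadata fields and the database layer are simplified here to plain dictionaries and lists):

    # Weightings from section 4.2.3; adjustable after further research.
    WEIGHTS = {
        "genres": 0.40, "actors": 0.20, "writers": 0.10,
        "directors": 0.10, "producers": 0.10,
        "sound_designers": 0.05, "directors_of_photography": 0.05,
    }

    def score(candidate, watched):
        # Rank one server-side movie against the client's local viewing history.
        total = 0.0
        for field, weight in WEIGHTS.items():
            seen = {name for movie in watched for name in movie.get(field, [])}
            credits = candidate.get(field, [])
            if credits:
                # fraction of this movie's credits that appear in the history
                total += weight * sum(1 for name in credits if name in seen) / len(credits)
        return total

    def select_downloads(available, watched):
        # The algorithm's final step: the buffer list, best match first.
        return sorted(available, key=lambda movie: score(movie, watched), reverse=True)

Because the server's catalogue is matched against the local history on the client, nothing about the viewer's habits needs to leave the device, which is the privacy approach described in section 3.4.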
4.2.4 Financial

Financial responsibilities for this project were kept to a minimum; however, the items which potentially required financing are identified below.

• Any hardware components which fail before or during the test - $300/£150/10,000฿
  • Actual cost: $50/£25/1,500฿ for a broken power supply
• Electricity costs for running 4 machines for an extended duration - $200/£100/7,000฿
  • Difficult to judge the actual cost, but probably around 3,000฿
  • (4 machines @ 400W each, running for 10 weeks, at a cost of 11฿ per unit)
• Licensing of software
  • Windows XP
    • I already own a subscription to MSDN, which allows for 25 copies of XP to be used simultaneously for development purposes
  • VMWare Fusion
    • $189/£90/6,300฿
  • Net Peeker
    • $25 for the server licence, $15 x 12 for the clients = $205/£100/7,000฿
• Additional literature required - $200/£100/7,000฿

This budget totals $1,094/£540/37,300฿. The amount stated above was planned for; however, at the end of the project the total spent was around $800/£400/28,000฿. N.b. the exchange rates have fluctuated throughout the year, but the GBP values are accurate.

4.2.5 Human

For human resources I needed one academic supervisor to aid with research/project support and one industry professional to aid with technical/practical support. I nominated Mr. Hardie Tucker as my research/project supervisor, and Dr. Anthony Finkelstein44 for technical/practical support. I consulted with Dr. Finkelstein regarding my project and where he might be able to assist me. I knew his input would be extremely technical, as he is head of the Computer Science department at UCL, and he helped me greatly with the project as he specialises in Software Systems Engineering. I felt that I still needed an "industry" contact in the specific field I am dealing with. Luckily I managed to find a family friend who was able to help: Graham Skelton, currently the CEO of CompleteTV,45 an IPTV service provider in the UK. His assistance proved invaluable for verifying market information.

Chapter 5

5.1 Overview

This chapter presents the findings of the above research in greater detail. As previously mentioned, I conducted a questionnaire for people to respond to, and the analysis of this data is provided below.

5.2 Results

All of the following analyses are based on the results from the questionnaire. These can be found in the appendix (section 7.3.12).

5.2.1 Gender of Respondents

The gender of respondents isn't really crucial to the analysis of the data; however, I attempted to include roughly the same number of males as females. Unfortunately, I received fewer replies from females than males, and so this must be taken into consideration when attempting to re-map results from the sample population to the general population.

5.2.2 Location of Respondents

The entire research is split between Thailand and the UK (I am studying in Thailand and will be moving back to the UK at the end of the course), so location is a crucial factor in decoding the following results. Based on this, I wanted to obtain data from both countries and attempt to link similarities and identify differences. I successfully split the population sample close to 50% UK and 50% Thailand. I believe that this aids my analysis, as I can make a fair comparison.

5.2.3 Amount of movies watched (Hours per week)

The respondents were asked how many hours per week, on average, they view movies, and specifically movies. At this stage no attempt was made to obtain data on where the movies were viewed or in what quality; I wanted this data as a base measure for analysing the following questions. I managed to get a fair spread of results, with some people even putting "0" as an answer. This surprised me, as "0" is very definite; in my opinion it is the same as saying "I do not ever watch movies". This may be true, but I doubt it. Regardless, the average was 7 hours per week, or between 4 and 5 movies a week, which seems about right to me.

5.2.4 Viewing Method

This question was one of my most crucial, in the sense that it would give me information on the way in which people watch movies. Not surprisingly, the easiest method was also the most used: TV. I believe TV is dominant because of its ease of use. It was encouraging to see that the next biggest choice was PC; a third of my sample population is far higher than I anticipated. It is also clear to me that going to the cinema is in decline; however, its taking one fifth of the results isn't surprising, as it is an established viewing method with features not found elsewhere (giant screen, etc.). Now that I had established how much people watched and how they watched it, I wanted to probe deeper into my field of research, specifically HD content.
For proper analysis of the following information we have to take into account the location of the respondents, as the availability of services differs. Another factor is that the "normal" prices paid for movies differ, as each country has a different purchasing power parity. The method I learnt for easily estimating PPP is comparing the price of a McDonald's Big Mac in each country; although it initially sounds silly, it is a method promoted by The Economist.46 Based on this, I found a ratio of 67฿ (the Thai price) to £2.29 (the UK price), giving an adjustment (based on exchange rates) of 120฿/67฿ = 1.79. This means that any monetary value given by people in Thailand needs to be increased by 79% for any sort of meaningful comparison.
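Worked through explicitly (the ~52฿/£ exchange rate is the one implied by the figures above):

    [£2.29 x ~52.4฿/£ ≈ 120฿, the exchange-rate value of the UK Big Mac price]
    [adjustment factor = 120฿ / 67฿ ≈ 1.79]
    [so a Thai respondent's 1,000฿ is treated as 1,000฿ x 1.79 ≈ 1,790฿ when compared with UK answers]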
The other way of viewing this is that, with the right product, there are a lot of people who are open to the idea of HD content. The best way to check whether this hypothesis is correct is to select only the people who answered "0" for this question and look at the amount they would be willing to pay for HD content.

5.2.9 Amount Willing to Pay if no HD Viewed

Although the above result was fairly ambiguous, the chart shows that of the people who didn't watch any HD movies at all, only 18% wouldn't pay for the proposed HD service. It was quite encouraging to see that the average amount these people were willing to spend was around 1000฿, vs. 1400฿ for people who know the benefits of HD content.

5.3 Project Activity

5.3.1 Pre-Buffer Data-Flow

Based on the results of the questionnaire and my previous experience and lessons learned, it was apparent to me that people didn't really care how their movies were delivered to them, or even which flavour of HD quality the movies themselves were presented in. I therefore designed the system with this in mind. The method is described in a detailed flowchart in the appendix (section 7.3.13), but essentially the system tests the client's internet connection and adapts both the method and the buffer levels to ensure flawless playback of the media. I have also allowed 10% extra on both the buffer and the remainder, in case the conditions of transfer change. The way this works is best illustrated with the flowchart and an example:

Movie length: 154 minutes
File-size: 4475 megabytes
Internet connection: 2 megabits/second
File pieces: buffer (including the 5% test) + remainder

The system tests the internet connection by downloading the first 5% of the movie from central servers (with effectively unlimited bandwidth). Based on this it calculates how much must be buffered to provide flawless playback, in this case 55%, or 2461 megabytes. The deciding factor is the time taken to download the remainder: the movie can be watched as long as the remainder can arrive while the movie plays, i.e. as long as the following inequality is satisfied:

(movie length x 60 x maximum internet speed / 8) > remainder file-size + 10%

or, with the example figures:

(154 minutes x 60 x 2 megabits/sec / 8) = 2310 megabytes > 2014 megabytes + 10% (2215 megabytes)

The "x 60" converts the movie running time into seconds (to match the internet speed), and the "/ 8" converts megabits into megabytes. The "+ 10%" allows for variations in the maximum internet speed, best explained by the phenomenon of "network overheads". Overheads are actually around 9.43%47, however I wanted the system to allow for differences in connections, and 10% satisfied this.

5.3.2 Results of the Tests

Based on the above I ran 8 tests in an attempt to establish minimum data rates and the number of movies available to watch per day on various connections. The broken-down results from these tests are available in the appendices. The results were very encouraging, as they were not far from what I had calculated; I had previously worked out how much buffer each connection would need in optimum conditions. One of the main purposes of the tests was to account for the fact that the internet, and IP networks in general, never run at full speed and unexpected slowdowns do occur. I will focus on the actual results from the tests in this section. As I had predicted, running the model on a 1 megabit connection provided disappointing results with regards to the number of movies available for playback each day.
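Before walking through the individual connection results, the buffer rule from section 5.3.1 can be expressed as a short script. This is a minimal sketch of my reading of the flowchart, not the actual test code; it folds in the 10% overhead allowance and the 10% minimum buffer that was adopted after the 8 megabit test described below. Its ideal-conditions estimates come out slightly below the measured times reported in this section.

def buffer_needed_mb(file_mb, movie_minutes, line_mbit):
    speed_mb_s = line_mbit / 8.0                       # megabits/s -> megabytes/s
    playable = movie_minutes * 60 * speed_mb_s / 1.10  # fetchable during playback, less the 10% overhead allowance
    remainder = min(playable, file_mb * 0.90)          # cap the remainder so at least 10% is always buffered
    return file_mb - remainder

def minutes_until_ready(file_mb, movie_minutes, line_mbit):
    return buffer_needed_mb(file_mb, movie_minutes, line_mbit) / (line_mbit / 8.0) / 60

# 3,750 MB test file, 129-minute average-length movie, as used in the tests.
for mbit in (1, 2, 4, 8):
    print(f"{mbit} megabit: buffer {buffer_needed_mb(3750, 129, mbit):.0f} MB, "
          f"ready in about {minutes_until_ready(3750, 129, mbit):.0f} minutes")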
After researching the average movie length48, I used a film close to the 129-minute average (Matrix Revolutions49) and left the test machines running. The results from the viewing clients are available in the appendices (section 7.3.14), however I will explain the findings here:

1 megabit connection

On a 1 megabit connection, the test movie (of average length) required 393 minutes to buffer, with the remainder arriving over a further 106 minutes while the film played, safely under the 129-minute running time. In reality this means that for one movie of average length to be ready on a 1 megabit connection, it takes about 6 1/2 hours of buffering. I had expected a figure of around 371 minutes, or 6 hours 11 minutes, so already I was seeing a difference between what the hardware and networking should provide and what they really delivered. If we take people's viewing hours as 6pm to 12am, this leaves an 18-hour window (12am to 6pm) for buffering, into which we can only fit 2 fully watchable movies; a 3rd would just push over the limit. As this is borderline, I would anticipate that a 3rd movie could potentially be buffered, depending on exact viewing habits and internet usage.

2 megabit connection

On a 2 megabit connection the results were rather surprising. I had expected roughly half the buffering time and around double the number of movies available after a day's test. Instead I found that it took 143 minutes to fully buffer the test film. This is 2 hours and 23 minutes, much less than half of the time taken in the 1 megabit test. The reason is that I had assumed half the time on the buffer and half on the remainder; in reality, doubling the speed doubles what can arrive while the film plays, while the remainder is capped only by the movie's fixed running time, so on a 2 megabit connection we can buffer much less and still retain "playability". This means that within the 18-hour window explained above we can potentially have 7 movies available to view at the end of the day, far more than required for one day's viewing. With this in mind I predicted that the following (4 megabit) test would far exceed my expectations in terms of the number of available movies.

4 megabit connection

The 4 megabit tests yielded some amazing results: the average-length test movie was fully ready to be watched in just 18 minutes. A 4 megabit connection is not uncommon in either Thailand or the UK (my places of interest), so this result really gave me confidence that my methods were working. In theory it is possible, using this system, to have 60 movies ready to watch per day. This is obviously far more than can be watched in a day, however the potential for choice is dramatically increased. The selection algorithm which I had worked so hard on previously seemed to become fairly obsolete; I had originally designed it to choose which films to buffer, selecting relevant movies based on the viewer's preferences, because I thought the system would only be able to handle a few movies per day. Based on this test it is clearly possible to have many movies ready for playback, giving people a greater choice. The selection algorithm is still relevant, because it will still rank movies which the user might enjoy higher than others; it is just that the number of movies available to play has significantly increased.
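The full selection algorithm is given as pseudocode in appendix 7.3.10; its core is a weighted score across a film's attributes, which can be sketched in a few lines. The weights are those from the appendix; representing films as plain dictionaries is an illustration of mine, not how the system stores them.

WEIGHTS = {"genre": 0.40, "actor": 0.20, "writer": 0.10,
           "director": 0.10, "producer": 0.10, "sound": 0.05, "dp": 0.05}

def score(candidate, viewing_profile):
    # viewing_profile holds the most common value of each attribute across
    # the films the user has already watched (the AVERAGE(MODE) of the
    # pseudocode); a candidate scores the weight of every attribute it matches.
    return sum(w for attr, w in WEIGHTS.items()
               if candidate.get(attr) == viewing_profile.get(attr))

def rank_for_download(candidates, viewing_profile):
    # Highest-scoring films are buffered first.
    return sorted(candidates, key=lambda film: score(film, viewing_profile), reverse=True)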
8 megabit connection

The previous tests had affirmed my calculations regarding content delivery, however I did not know this before the tests ran. As the 4 megabit test had already provided so many available movies, the 8 megabit test wasn't really necessary, however the test set-up had already run. In fact the results aided my research greatly, because all of the tests failed completely. The clients were throwing back "unable to connect to host" errors after the initial 5%, meaning they were unable to connect to the host computers (and the other clients on the P2P network). I was shocked by this, as all other tests had run to completion without many problems. On investigating further, I learnt that there was a problem with the calculation used to decide how much to buffer. I had to run the test manually and observe all of the machines and the instructions given to them. The buffer size was being calculated from the first 5% of the download, and on an 8 megabit connection (after network overheads were automatically taken into account) the calculated buffer size was actually negative, because the connection was so fast. The clients were therefore attempting to "reverse buffer", i.e. to send data which they didn't yet have back to the server. This is set out numerically in the appendices, but it was plainly an error in the algorithm used to determine buffer size; in one of the tests the client was attempting to send back around 70% of the movie (which it didn't have yet). Once I had corrected the formula used to calculate the buffer amount, I re-ran the test. The 8 megabit connection didn't actually require any buffering at all, which meant that when I attempted to play the movie there was around a 2-minute delay before it would start, best explained by the file-headers in the movie files I was using. Headers containing important information about the rest of the file are normal for most computer data files, however this presented a problem, as the entire project has been designed with "flawless playback" in mind. I decided to re-write the formula once again so that, regardless of the initial 5% buffer (used to test the client's connection speed), it would always download an additional 5%, giving a minimum buffer of 10%. Once this test was run, the results were on track with what I was expecting: an instant start without any breaks in viewing.

5.3.3 Summary of Results

Based on the above section I drew the following conclusions, which aided me in my final test:

1. The system will work on any connection, as long as it is at least 1 megabit.
2. Only with an 8 megabit connection can we have a service close to true "on-demand".
3. Any connection in between will allow people to have a "pseudo-on-demand" service (where previously none was available).
4. The algorithms and calculations are not perfect and will need constant attention to ensure as few bugs as possible.
5. The P2P network, even on a controlled local network, under-performed by around half compared with what I had expected.
6. All tests provided results which would satisfy both the average (7 hours / 3-4 movies) and the maximum (18 hours / 8-9 movies) "number of movies viewed per week" gathered from the earlier questionnaire.
7. The longer the movie being buffered, the less time is needed to buffer it, because there is more time to download the remainder while the movie is being watched.
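Point 6 can be sanity-checked in a few lines, using the measured time-to-ready figures from the tests above. This is an illustrative estimate of mine rather than part of the tested system; it reproduces the 2, 7 and 60 movies-per-day figures quoted in this section.

def movies_per_day(ready_minutes, window_hours=18):
    # How many films can be buffered back-to-back in the idle window
    # (12am to 6pm) before the evening's viewing starts.
    return int(window_hours * 60 // ready_minutes)

# Measured time-to-ready figures from section 5.3.2, in minutes.
for mbit, ready in ((1, 393), (2, 143), (4, 18)):
    print(f"{mbit} megabit: about {movies_per_day(ready)} movies per day")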
5.3.4 Final Test

For the final test I took all of the above into account, whilst testing the system with a longer (and therefore larger file-size) film. The film I used was "Pulp Fiction"50, chosen because at 154 minutes (2 1/2 hours) it is somewhat longer than the average quoted above, allowing me to test the system with a larger file. I had wanted to use an even longer movie, for example "Titanic" (194 minutes), however I didn't have an HD version to test with. The reasoning for using a "Hollywood" film as opposed to indie content was that I know the film personally, having watched it many times over, and such films are usually available in HD. This allows me to easily notice any flaws in playback, be it smoothness, colours, audio/video sync issues etc. I ran the final test on 4 different occasions, one for each simulated internet speed. Each test was left to run from around midnight until around 6pm the following evening. As expected, each of the tests completed without any problems, the major issues having already been identified in the earlier tests. For 4 successive nights I watched Pulp Fiction from start to finish without any interruptions. The results were extremely pleasing, as I had achieved what I set out to do. Further details can be seen in section 7.3.16.

5.4 Problems/Limitations of the system

From both my own observations and informal feedback from people I have discussed the project with, certain issues arise:

5.4.1 Power Usage

As the system has been designed to run 24 hours per day, there is an obvious concern over power usage. In recent years, especially in the UK, both government and private organisations have been lobbying for people to waste as little energy as possible51. This includes the basic step of unplugging all devices which are in "standby" mode when not in use, especially overnight. This causes a significant problem for the structure of my system, as it relies on exactly this "downtime" in order to function: the system works by matching people's downtime with the system's uptime, and if people switch the device off during their downtime, the entire system will not work. A possible workaround is to clearly inform people not to power down the device at all. For this to be acceptable in today's society, the system would need to be as efficient as possible with regards to electricity usage. As I am not manufacturing the hardware for the system, this is outside the scope of the project, however it is a genuine concern to be aware of.

5.4.2 Internet Connection Usage

The system has been designed to use only the idle portion of the internet connection. This is achieved partly by the initial 5% speed test and partly by using QoS52 (quality of service) rules. By telling the internet gateway (usually an ADSL/cable modem/router) that the system's traffic is of the lowest priority, all other internet traffic becomes a higher priority. For example, if the system is using 100% of the user's internet connection and the user then checks their e-mail, the required bandwidth is allocated to the e-mail application so as not to interfere with the usual use of the connection.
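One way a client can mark its own traffic as lowest priority is to set the TOS/DSCP field on its sockets, which QoS rules on a DSCP-aware gateway can then act on. A minimal sketch, assuming a platform that exposes IP_TOS (e.g. Linux) and an illustrative server address; this is one possible mechanism, not necessarily the one used in the tests:

import socket

# 0x20 is the TOS byte for DSCP class CS1 ("low priority" / background
# traffic), which QoS rules can match and queue behind normal traffic.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, 0x20)
sock.connect(("media-server.example.com", 8080))   # hypothetical piece server
sock.sendall(b"GET /pieces/042 HTTP/1.0\r\n\r\n")  # fetch one movie piece at low priority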
The problem with relying on the connection in this way is that quite a few ISPs (internet service providers) impose a monthly limit on the amount of data transferred, and the amount of data the system uses is extremely high due to the nature of HD video. Users would have to be aware of this, as they might otherwise run into problems with their ISP. In any situation where the system is used in a closed circuit (hotel, company LAN etc.) this is no longer an issue. The project is focussed on the ability to provide a pseudo-on-demand service through any connection (over 1 megabit), not on additional limitations imposed by external companies. A potential solution is that either the ISP makes an arrangement with the IPTV service provider (to exclude the transfers from the monthly data count), or the ISP simply becomes the provider of the IPTV service itself. As this project isn't concerned with business agreements and licensing, it will not be discussed further; nevertheless it is an issue which arose while obtaining feedback from one of my industry contacts.

5.4.3 Fast-forwarding

While testing the system it became apparent that one useful (and basic) feature of in-home entertainment cannot work in the system. Fast-forwarding, or skipping through the movie, is not possible because of the way the system has been designed: there is no data past a certain point, as it is still being downloaded. I tried to establish a point at which fast-forwarding would work, namely once the entire movie has been downloaded, however by then the movie has usually already been watched. I have not built any kind of fast-forward handling into the system due to time constraints; initially, I believe the simplest solution is to not allow fast-forwarding at all. This falls in line with my personal preference for watching a movie from start to finish in its entirety, much the same as when viewing a film in a cinema. Pausing has been allowed in the current model of the system, as it increases the amount of data downloaded and therefore only aids the end goal of playback without network/data interruptions.

5.4.4 Licensing

Based on the earlier questionnaire and feedback from industry professionals, the single most limiting factor to the commercial success of this project is licensing. As previously mentioned by a questionnaire respondent, "content is king". All of the testing has been done in private and with an "educational licence" in mind; in the public sector, however, licences would need to be obtained for content, and obtaining more popular content potentially means more people would be interested in investing in the service.

5.5 Potential Benefits of the System

5.5.1 Advertising

As the system relies on the media file being split into parts for distribution, it is potentially easy to add advertising, or previews of other movies, to the beginning of the requested content. This also means that the "negative" of losing fast-forwarding, as detailed above, can be turned into a "positive": advertisers would be more inclined to invest in a medium in which they get guaranteed views. The possibility of the aforementioned 5% "bandwidth test" being paid for by advertising also presents a neat solution: the data costs of the server's bandwidth would be offset by the revenues from advertisers.
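The piece-splitting that makes this possible is shown in appendix 7.3.7.5; a minimal sketch of the idea follows, cutting a file into 100 equal pieces so that each piece is a round percentage of the total, with an advert then being nothing more than extra pieces placed ahead of piece 0. The file name is illustrative and this is my reading of the appendix, not the server's actual tooling.

import os

def split_file(path, pieces=100):
    size = os.path.getsize(path)
    piece_size = -(-size // pieces)  # ceiling division so no bytes are lost
    with open(path, "rb") as src:
        for i in range(pieces):
            chunk = src.read(piece_size)
            if not chunk:
                break
            # Pieces are numbered so clients can request them by percentage.
            with open(f"{path}.{i:03d}", "wb") as dst:
                dst.write(chunk)

split_file("movie.ts")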
Such advertising revenue is of course a business bonus rather than a research bonus, but it is extremely relevant when looking towards the future of the project.

5.5.2 Content Rights

Although the system has been tested with a single codec (chosen in the specialised application part of the course), it is possible to modify the file's properties or even change the codec altogether. As long as the codec used is "sequential", that is, the beginning of the movie is at the beginning of the file and the end of the movie is at the end of the file, it will work with the system. Because of this it is possible, although not necessary, to include content management systems within the files being distributed. In a real-life scenario this allows far greater control of where, how and when the content may be viewed. Personally I do not really agree with such limiting systems, but unfortunately many of the major studios do. It is possible to allow multiple viewing options based on the user's needs: renting, buying, or a blanket licence. Renting would be set up so that the user "pays per view" of a movie, usually with certain restrictions on the availability of the content, for example licence expiry after 3 days (much like the traditional Blockbuster53 model). With the purchasing method, the user would essentially buy a licence to view the content, usually with restrictions on where they can play it back (much like a Blu-ray Disc54). The method I favour is the "blanket licence" model: while a user subscribes to the service, they may view as much content as they wish. There would still usually be restrictions in force; for example, a user who stopped paying would lose their privileges to view the content. This model is currently employed by Sky55, a UK-based satellite company.

5.5.3 Indie Producers

There is room in the system to allow users themselves to add content to the library, similar to YouTube56. This would mean that producers of independent content are able to enjoy the same benefits studio productions do: retaining high-quality picture and sound all the way to the viewer. Also, with the selection algorithm employed in the project, indie movies which people wouldn't usually watch might get more attention than normal. If we look at another part of the industry where this is already happening, it is clear that recently (within the last 2 years) independent producers of all sorts have been breaking into the market much more than previously. A specific example is the "AppStore"57 from Apple Computers58. Apple have opened up a market populated by both large companies and independent producers of software applications, specifically for use on their iPhone59/iPods60. This model was challenged by many industry professionals, however there have been many "success stories" of independent producers creating popular applications for the general public61. One example is an application called "Trism" by Steve Demeter62; the article claims that he made a profit of around $250,000 in the 2 months after releasing the application. Although this is based on software, the principles are the same: users are buying a licence to use an application made by an independent developer, all conducted within a trusted marketplace.

5.5.4 Expandability

The client systems retain the ability to be updated by the servers at any time. For example, if a bug is found in the client, all clients can be updated with a bug-fix with no user interaction whatsoever. This is something which has not been included in this project, as I felt it was overkill given that I was sat next to all of the clients and the test-bed was built from virtual machines.
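A client self-update along these lines could be as simple as the sketch below. This was deliberately left out of the tested system, so the URL, version-file layout and helper names here are hypothetical.

import json
import urllib.request

CLIENT_VERSION = "1.0.0"

def check_for_update():
    # Ask the server what the latest client version is.
    with urllib.request.urlopen("http://media-server.example.com/client/latest.json") as resp:
        latest = json.load(resp)
    if latest["version"] != CLIENT_VERSION:
        # Stage the new client binary; the device would then restart it
        # quietly during the next idle period, with no user interaction.
        urllib.request.urlretrieve(latest["url"], "client-update.bin")
        return True
    return False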
5.5.5 Data Amounts

The system was designed principally to play back HD content whilst adapting to current networking limitations. One off-shoot of this is that the load placed on the central server was greatly reduced when compared to the total amount of data requested by all of the clients. With each test film at 3,750 MB being sent to 12 clients, we would expect a total of 45,000 MB (45 GB) to be sent from the server. However, due to the peer-to-peer aspect of the system, this figure was around 15,000 MB (15 GB) for the initial tests. The remainder of the data was sent through the peer-to-peer network, reducing the load on the server by a factor of 3.

Chapter 6

6.1 Conclusions

If we refer back to the original statement, "Design and Testing of On-Demand Distributed Content Models for Distribution of High Definition Media Over IP Networks", then as long as the basic criteria are satisfied, the project is a "success": I designed and tested an on-demand model for HD media over IP networks, and in this regard the project succeeded in proving that it is at least possible. However, I had set out further criteria to allow me to draw a conclusion: the playback of the content must be in high definition (based on my specialised application work and the questionnaire), and it must be "flawless", i.e. no breaks in viewing. Based on the analysis of the project given above, I can say with certainty that I achieved these goals. Based on the questionnaire, people are not that "fussy" about how "high definition" something really is. I believe that if you simply told someone the content they were watching was presented in HD, when in reality it was up-scaled SD material, 95% of the time they would believe you. Regardless, I had certain standards to uphold, and the results from the earlier tests showed that HD content needn't take up masses of data: the perceived quality was still indistinguishable from uncompressed HD footage. Once the initial calculations were complete, it was difficult to tell whether the project would work correctly in "real-life" situations, however as previously mentioned it exceeded my expectations. Once a few bugs had been addressed, the playback of content was perfect, both the high-definition video and the Dolby Digital 5.163 sound. Although both audio and video streams are compressed, the majority of users will be used to this from previous formats (DVD, Blu-ray etc.) and furthermore already have the hardware needed for playback. In conclusion, I feel that the only setback to this project being taken further is the issue of licensing. As we have established that "content is king", without any "decent" content for viewers to watch, why would they invest in a service such as this?

6.2 Recommendations

Working with such technical data on a project like this raises many problems, and only the most technically comfortable of individuals should attempt it. I have been actively involved in the IT industry for nearly 10 years and yet still found this project challenging.
Ideally, anyone wanting to do similar research should first read this paper and be able to understand the principal data flows and algorithms. Once that is established, a good point of reference is the book "The Technology of Video & Audio Streaming" by David Austerberry64. This book has been an invaluable source of information, however as the industry moves at a fast pace I would recommend obtaining the most up-to-date edition and supplementing it with external sources. I would also recommend that people wishing to undertake a similar project focus on just one country or location. I decided to focus on two locations because I was living in Thailand and aiming to move back to the UK after my studies completed. Although at the time I thought this was the best choice, it caused quite a few problems during my research and data-gathering: working across different time-zones is something to be wary of, as are varying levels of technology, and when comparing monetary figures it is difficult to provide an accurate analysis, the exchange rate having been so volatile during my year of study. Although I was happy with my final questionnaire results, I would strongly recommend that future researchers attempt to gain a larger sample population. It was difficult co-ordinating across countries, and I still feel that 21 people was not enough. As previously mentioned, data accuracy was an issue; I believe that had I asked people in the street, I might have obtained more accurate results than with my initial questionnaire. With regards to the testing methods employed, I would recommend future researchers attempt a more widespread testing set-up. I tried my best to simulate high demand, however there is only one of me, and viewing 12 separate video streams while checking for any problems in playback was impossible. I attempted to judge this from playback logs, however what a program logs and what it really does are sometimes quite different. This could be addressed by setting up 3 or more viewing stations and asking people to use the system simultaneously and check for flawed playback. Finally, as previously mentioned, the amount of content and the source of the content is a high priority, both for testing and for any future development of the project. It would be advisable to do a full test with an "industry standard" file with DRM included (for example from the iTunes HD movie store), however these files are not public at this stage, being usable only with an AppleTV65 66. I feel it is important to design the system with as few limitations as possible, so as to be able to expand it in the future (music/game distribution etc.).
Chapter 7

7.1 Endnotes

1 http://www.engadget.com/2008/03/19/fcc-redefines-broadband-to-mean-768kbps-fast-to-mean-kinda/
2 http://www.roughlydrafted.com/RD/RDM.Tech.Q2.07/3FE4864A-FC79-4EAD-BCB3-45C0B0C830BD.html
3 http://cityroom.blogs.nytimes.com/2007/11/14/off-goes-the-power-current-started-by-thomas-edison/
4 http://www.socialmedia.biz/2006/12/convergence_and.html
5 http://www.buzzle.com/editorials/4-10-2005-68350.asp
6 http://dictionary.reference.com/browse/model
7 http://www.dummies.com/how-to/content/entering-the-world-of-hdtv.html
8 http://www.starlancs.com/EducateMe/educate_IP_stack.html
9 http://www.thefreedictionary.com/network
10 http://www.telechoice.com/bios_dbriere.asp
11 http://www.telechoice.com/bios_dbriere.asp
12 http://www.telechoice.com/default.asp
13 http://books.google.com/books?id=YY41cKhmsjYC
14 Briere & Hurley, 2007, p. 86
15 http://broadcastengineering.com/david-austerberry-editor/
16 http://www.intel.com/technology/mooreslaw/index.htm
17 http://www.dti.gov.uk/files/file13434.pdf
18 http://www.opsi.gov.uk/Acts/acts1998/ukpga_19980029_en_1
19 http://www.1st-free-music-download.com/napster.html
20 http://en.wikipedia.org/wiki/File:Standard_video_res.svg
21 http://education.jlab.org/beamsactivity/6thgrade/vocabulary/index.html
22 http://www.vmware.com/
23 http://www.bolton.ac.uk
24 http://data.bolton.ac.uk/bissto/researchskills/research_method/data_collection/triangulation.htm
25 http://www.statistics.gov.uk/cci/nugget.asp?id=1659
26 http://www.media-awareness.ca/english/resources/research_documents/statistics/television/tv_viewing_habits.cfm
27 http://brent.tvu.ac.uk/dissguide/hm1u3/hm1u3text3.htm
28 http://forum.doom9.org/
29 http://www.mariposahd.tv/
30 http://www.imdb.com/title/tt0110912/
31 http://www.intel.com/products/processor/pentium4/index.htm
32 http://www.smc.com/index.cfm?event=viewProduct&cid=6&scid=24&localeCode=EN_USA&pid=1147
33 http://www.vmware.com/
34 http://www.microsoft.com/windowsxp
35 http://filezilla-project.org/
36 ibid.
37 http://www.bitcomet.com/tools/tracker/index.htm
38 http://azureus.sourceforge.net/
39 http://www.net-peeker.com/
40 http://www.bitcomet.com/tools/tracker/index.htm
41 http://forums.afterdawn.com/
42 http://forum.doom9.org/
43 http://www.cs.cornell.edu/Courses/cs482/2003su/handouts/pseudocode.pdf
44 http://www.cs.ucl.ac.uk/staff/a.finkelstein/
45 http://www.completetv.com/
46 http://www.economist.com/markets/bigmac/
47 http://pflog.net/dsl_overhead/
48 http://www.infinitypoint0.com/60/imdb-film-length-project/
49 http://www.imdb.com/title/tt0242653/
50 http://www.imdb.com/title/tt0110912/
51 http://www.energysavingtrust.org.uk/What-can-I-do-today/Britain-Unplugged
52 http://www.cisco.com/en/US/docs/internetworking/technology/handbook/QoS.html
53 http://www.blockbuster.co.uk/
54 http://www.blu-ray.com/
55 http://www.sky.com/
56 http://www.youtube.com
57 http://www.apple.com/iphone/features/appstore.html
58 http://www.apple.com
59 http://www.apple.com/iphone/
60 http://www.apple.com/ipodtouch/
61 http://blog.wired.com/gadgets/2008/09/indie-developer.html
62 http://blog.wired.com/gadgets/2008/09/app-store-is-a.html
63 http://www.dolby.com/consumer/technology/dolby_digital.html
64 Austerberry, David. "The Technology of Video & Audio Streaming" 2nd Ed. Focal Press, 2005.
65 http://www.apple.com/uk/appletv/
66 http://news.cnet.com/8301-17938_105-9850941-1.html

7.2 References
1st Free Music Download. "History of Napster Technology" 1st-free-music-download.com. Accessed on 2nd January 2009.
AfterDawn. "AfterDawn Forums" afterdawn.com. Accessed on 2nd January 2009.
Amazon.com. "Matrix Revolutions (2003)" imdb.com. Accessed on 2nd January 2009.
Amazon.com. "Pulp Fiction (1994)" imdb.com. Accessed on 2nd January 2009.
American Psychological Association (APA). "model. (n.d.)." Webster's Revised Unabridged Dictionary. Retrieved 2nd January 2009.
Apple Computers. "AppStore" apple.com. Accessed on 2nd January 2009.
Apple Computers. "Apple" apple.com. Accessed on 2nd January 2009.
Apple Computers. "iPhone" apple.com. Accessed on 2nd January 2009.
Apple Computers. "iPod Touch" apple.com. Accessed on 2nd January 2009.
Apple Computers. "AppleTV" apple.com. Accessed on 2nd January 2009.
Austerberry, David. "The Technology of Video & Audio Streaming" 2nd Ed. Focal Press, 2005.
Bell, Donald. "iTunes HD Movie Rental only for Apple TV?" cnet.com. Accessed on 2nd January 2009.
BitComet. "A Free BitTorrent Client" bitcomet.com. Accessed on 2nd January 2009.
Blockbuster. "Blockbuster" blockbuster.co.uk. Accessed on 2nd January 2009.
Blu-Ray. "Blu-Ray" blu-ray.com. Accessed on 2nd January 2009.
Briere, Danny and Pat Hurley. "HDTV for Dummies" Wiley Publishing, 2007.
BSkyB. "Sky TV" sky.com. Accessed on 2nd January 2009.
Chen, Brian X. "iPhone Developers go from Rags to Riches" wired.com. Accessed on 2nd January 2009.
Cisco. "Quality of Service" cisco.com. Accessed on 2nd January 2009.
Collins English Dictionary. "network. (n.d.)" Collins Essential English Dictionary 2nd Edition. (2004, 2006). Retrieved 2nd January 2009.
CompleteTV. "CompleteTV" completetv.com. Accessed on 2nd January 2009.
Cornell University. "Pseudo-code" cs.cornell.edu. Accessed on 2nd January 2009.
Department of Culture, The. "Lifestyles" statistics.gov.uk. Accessed on 2nd January 2009.
Department of Trade & Industry. "Connecting the UK: the Digital Strategy" dti.gov.uk. Accessed on 2nd January 2009.
Dilger, Daniel Eran. "Movie Studios vs. Consumers in Home Theater" Roughly Drafted. Accessed on 2nd January 2009.
Dolby Labs. "Dolby Digital" dolby.com. Accessed on 2nd January 2009.
Economist, The. "Big Mac Index" economist.com. Accessed on 2nd January 2009.
Energy Saving Trust. "Britain Unplugged" energysavingtrust.org.uk. Accessed on 2nd January 2009.
FileZilla. "The Free FTP Solution" filezilla-project.org. Accessed on 2nd January 2009.
"Hellisp". "Standard Video Resolutions" wikipedia.org. Accessed on 2nd January 2009.
Hurley, Patrick J. "A Concise Introduction to Logic" Thomson/Wadsworth, 2006.
Infinity point 0. "Film Length Project" infinitypoint0.com. Accessed on 2nd January 2009.
Intel. "Moores Law" intel.com. Accessed on 2nd January 2009.
Intel. "Pentium 4 Processor" intel.com. Accessed on 2nd January 2009.
Jelsoft Enterprises. "Doom9's Forum" doom9.org. Accessed on 2nd January 2009.
Jin, Ming. "Net-Peeker" net-peeker.com. Accessed on 2nd January 2009.
Lasica, J.D. "Convergence & Cultural Change" Social Media. Accessed on 2nd January 2009.
Lee, Jennifer. "Off Goes the Power Current Started by Thomas Edison" New York Times. Accessed on 2nd January 2009.
Media Awareness Network. "Statistics on TV Viewing Habits (1994-2000)" media-awareness.ca. Accessed on 2nd January 2009.
Microsoft. "Windows XP" microsoft.com. Accessed on 2nd January 2009.
Motamorfosis Productions. "Mariposa HD" mariposahd.tv. Accessed on 2nd January 2009.
Office of the Public Sector Information. "Data Protection Act 1998" opsi.gov.uk. Accessed on 2nd January 2009.
Parekh, Nilesh. "Software Testing - White Box Testing Strategy" Buzzle. Accessed on 2nd January 2009.
Penton Media. "David Austerberry" broadcastengineering.com. Accessed on 2nd January 2009.
pflog. "DSL Overheads" pflog.net. Accessed on 2nd January 2009.
SMC Networks Inc. "SMC8508T Switch" smc.com. Accessed on 2nd January 2009.
Sorrel, Charlie. "AppStore is a Goldmine" wired.com. Accessed on 2nd January 2009.
Sourceforge. "Azureus BitTorrent Client" sourceforge.net. Accessed on 2nd January 2009.
Sparrow, Andrew. "Film & TV Distribution and the Internet" Gower, 2007.
Stallings, William. "Computer Organization & Architecture" Prentice Hall, 2003.
StarLAN. "TCP/IP Stack Explained" starlancs.com. Accessed on 2nd January 2009.
TeleChoice. "Danny Briere" telechoice.com. Accessed on 2nd January 2009.
Thames Valley University. "Primary Data Collection Methods" tvu.ac.uk. Accessed on 2nd January 2009.
Thomas Jefferson National Accelerator Facility. "Vocab List" education.jlab.org. Accessed on 2nd January 2009.
University College London. "Dr. Anthony Finkelstein" cs.ucl.ac.uk. Accessed on 2nd January 2009.
University of Bolton, The. "Triangulation" bolton.ac.uk. Accessed on 2nd January 2009.
VMWare. "VMWare: Virtual Machines" vmware.com. Accessed on 2nd January 2009.
Wiley Publishing. "Entering the World of HDTV" Dummies.com. Accessed on 2nd January 2009.
YouTube. "YouTube" youtube.com. Accessed on 2nd January 2009.

7.3 Appendices

7.3.1 Scan from "HDTV for Dummies" page 86

7.3.2 SD/HD Resolution Comparison Table
Source: http://en.wikipedia.org/wiki/File:Standard_video_res.svg

7.3.3 Course Flowchart

7.3.4 Example Questionnaire

Hey guys, reckon you could help me out with my research and answer this quick questionnaire? Please be honest, it's all anonymous and not a competition! Feel free to add options if nothing suits you (UK people just put the prices in pounds). This is all about MOVIES, not TV series or anything else, just MOVIES:

How many hours per week do you watch movies?
| 0-2 hours | 2-5 hours | 5-10 hours | 10-15 hours |

What method do you primarily use? (Cinema, TV, PC, PS3, PSP, Mobile Phone...)
| Cinema | TV | PC/Laptop | PS3/Xbox | PSP/PMP | Mobile Phone |

Do you own an HDTV?
| Yes | No |

Do you own any equipment which plays HD? (PC, Blu-Ray player, PS3 etc.)
| Yes | No |

How often do you watch HD content? (Hours per week)
| 0-2 hours | 2-5 hours | 5-10 hours | 10-15 hours | N/A |

How much would you pay (per month) for unlimited HD content on your device of choice?
| 0฿ | 500฿ | 1000฿ | 1500฿ | 2000฿ | More than 2000฿ |

Thank you for your time, Mat.

7.3.5 Project Schedule
[Month-by-month project schedule table; summary of the recoverable entries:]
January: write proposal; download 40 hours of 1080p material. Deadline 4th: proposal due.
February: research/analyse viewing trends.
March: research/develop virtual test systems; research virtual machine methods/limitations. Deadline 31st: have systems in place.
April: run a test system and check for accuracy; convert 1080p material into other formats (1 week off for Songkran Festival).
May: collect results from first test run, check accuracy (last week off for birthday).
June: collect results from first 4 test runs, analyse and note findings.
July/August: work placement; download further material for final test. Deadline 31st: propose a final solution for further testing.
August-October: begin first test (3rd week); run proposed test for 2 weeks (from the 2nd week); weekly test runs 2-8; collect results from last 4 test runs, analyse and note findings; prepare final solution, refine. Deadlines: 30th, complete findings; 3rd week, 1st draft of complete paper.
November: collect results from final test, analyse, note findings and conclude; refine final paper.
December: refine research results; prepare for presentation. Deadline 26th: completed research & project presentation.

7.3.6 Hardware Testing
Picture taken on 12/11/08

7.3.7.1 Server - Installing Windows XP

7.3.7.2 Server - Network Speed
1 Gbps = 1024 Mbps

7.3.7.3 Server - Network Analyser
n.b. the only process using bandwidth is the peer-to-peer tracker

7.3.7.4 Server Configuration
Installing IIS (Internet Information Services) to allow for an FTP server
Installing peer-to-peer network services

7.3.7.5 Server - Splitting of Media File
Total size: 4475 MB
Split size: 89.5 MB
Number of files: 100
The split files allow one piece to equal 1% of the total file, to allow for easier allocation based on the client network speed.

7.3.8.1 Client - Network Analyser
n.b. "Process 1396" in green is VNC, a screen-sharing application. 12.56 KB/sec is around 15% of the available bandwidth on a 1 Mbit connection.

7.3.8.2 Client - Simulating ADSL Bandwidth
n.b. KB/s x 8 = kbps
Upload: 64 KB/s = 512 kbps
Download: 256 KB/s = 2048 kbps (2 megabit)

7.3.9.1 Viewer - Connection to Server

7.3.9.2 Viewer - Method

7.3.10 Viewing Bias Code

All code is written in pseudocode, guidelines: http://www.cs.cornell.edu/Courses/cs482/2003su/handouts/pseudocode.pdf

'Begin Code
'----------

'Basics
'------
Local [New] as Array(...)     'Signifies whether any new content is available
Local [ForDL] as Array(...)   'Ranked list of what to download
Local [Score] as Integer      'Holds the "score" of each content
Global [Match] as Array(...)  'Keeps lists of potential matches

'Cleaning out all variables for first use
[New] = ""
[ForDL] = ""
[Score] = ""
[Match] = ""
OpenConnection(LOCAL) as Object   'Connect to the local database
Local.Authenticate                'Authenticate with the local server
OpenConnection(REMOTE) as Object  'Connect to the remote database
Remote.Authenticate               'Authenticate with the remote server

'Rankings
'--------
'Genre (40%)
'Actor (20%)
'Writer (10%)
'Director (10%)
'Producer (10%)
'Sound (5%)
'DP (5%)

i = 1
WHILE i <= Remote.Count DO  'Check all available movies on the server
    IF Remote.Film(i).Genre = Local.Film.Genre(AVERAGE(MODE)) THEN
        [Match].Genre = Remote.Film(i)
        [Score].Remote.Film(i) = [Score].Remote.Film(i) + 0.4
    END IF
    IF Remote.Film(i).Actor = Local.Film.Actor(AVERAGE(MODE)) THEN
        [Match].Actor = Remote.Film(i)
        [Score].Remote.Film(i) = [Score].Remote.Film(i) + 0.2
    END IF
    IF Remote.Film(i).Writer = Local.Film.Writer(AVERAGE(MODE)) THEN
        [Match].Writer = Remote.Film(i)
        [Score].Remote.Film(i) = [Score].Remote.Film(i) + 0.1
    END IF
    IF Remote.Film(i).Director = Local.Film.Director(AVERAGE(MODE)) THEN
        [Match].Director = Remote.Film(i)
        [Score].Remote.Film(i) = [Score].Remote.Film(i) + 0.1
    END IF
    IF Remote.Film(i).Producer = Local.Film.Producer(AVERAGE(MODE)) THEN
        [Match].Producer = Remote.Film(i)
        [Score].Remote.Film(i) = [Score].Remote.Film(i) + 0.1
    END IF
    IF Remote.Film(i).Sound = Local.Film.Sound(AVERAGE(MODE)) THEN
        [Match].Sound = Remote.Film(i)
        [Score].Remote.Film(i) = [Score].Remote.Film(i) + 0.05
    END IF
    IF Remote.Film(i).DP = Local.Film.DP(AVERAGE(MODE)) THEN
        [Match].DP = Remote.Film(i)
        [Score].Remote.Film(i) = [Score].Remote.Film(i) + 0.05
    END IF
    i++
END WHILE

SWITCH
    CASE [Match].Genre EXISTS THEN
        [ForDL] = [ForDL] + [Match].Genre
        BREAK
    CASE [Match].Actor EXISTS THEN
        [ForDL] = [ForDL] + [Match].Actor
        BREAK
    CASE [Match].Writer EXISTS THEN
        [ForDL] = [ForDL] + [Match].Writer
        BREAK
    CASE [Match].Director EXISTS THEN
        [ForDL] = [ForDL] + [Match].Director
        BREAK
    CASE [Match].Producer EXISTS THEN
        [ForDL] = [ForDL] + [Match].Producer
        BREAK
    CASE [Match].Sound EXISTS THEN
        [ForDL] = [ForDL] + [Match].Sound
        BREAK
    CASE [Match].DP EXISTS THEN
        [ForDL] = [ForDL] + [Match].DP
        BREAK
END SWITCH

LOCAL.[ForDL] SORT BY [Score]  'List the films, ranked by score
LOCAL.Download [ForDL]         'Download the films in score order

CloseConnection(LOCAL)
CloseConnection(REMOTE)

'End Code
'--------

7.3.11.1 Original Virtual Data Flow Diagram
[Diagram: Server plus Clients 1-3, hosting VClient1-VClient12, connected through a 1000 Mbit switching hub to the Viewing Client]

7.3.11.2 Refined Virtual Data Flow Diagram
[Diagram: as above, with the refined data flow between the Server, Clients 1-3 (VClient1-VClient12), the 1000 Mbit switching hub and the Viewing Client]

7.3.12.1 Questionnaire Results

7.3.12.2 Questionnaire Analysis
[Charts: Age; Gender; Location; Amount watched (hours/week); Viewing Method; Amount Willing To Pay; Own an HDTV; Own HDTV Equipment; Amount of HD Watched (hours/week); Amount Willing to Pay if no HD Viewed]

7.3.13 Data Flow Decisions

7.3.14 Initial Testing Results
All figures in MB. "Up"/"Down" = uploaded/downloaded; "(NP)" = as measured by Net-Peeker.

Client            | Up torrent | Up FTP | Up (NP) | Up total | Down torrent | Down FTP | Down (NP) | Down total
Client 1 (1mbit)  | 1,987      | 0      | 1,996   | 1,987    | 2,984        | 766      | 3,759     | 3,750
Client 2 (2mbit)  | 1,886      | 0      | 1,891   | 1,886    | 2,841        | 909      | 3,758     | 3,750
Client 3 (4mbit)  | 2,013      | 0      | 2,015   | 2,013    | 2,586        | 1,164    | 3,755     | 3,750
Client 4 (8mbit)  | 3,984      | 0      | 3,985   | 3,984    | 2,347        | 1,403    | 3,754     | 3,750
Client 5 (1mbit)  | 2,055      | 0      | 2,060   | 2,055    | 2,884        | 866      | 3,760     | 3,750
Client 6 (2mbit)  | 2,040      | 0      | 2,046   | 2,040    | 2,845        | 905      | 3,756     | 3,750
Client 7 (4mbit)  | 2,018      | 0      | 2,022   | 2,018    | 2,562        | 1,188    | 3,759     | 3,750
Client 8 (8mbit)  | 3,851      | 0      | 3,855   | 3,851    | 2,288        | 1,462    | 3,757     | 3,750
Client 9 (1mbit)  | 2,066      | 0      | 2,066   | 2,066    | 2,802        | 948      | 3,757     | 3,750
Client 10 (2mbit) | 1,859      | 0      | 1,865   | 1,859    | 2,654        | 1,096    | 3,752     | 3,750
Client 11 (4mbit) | 1,736      | 0      | 1,741   | 1,736    | 2,455        | 1,295    | 3,758     | 3,750
Client 12 (8mbit) | 4,097      | 0      | 4,102   | 4,097    | 2,307        | 1,443    | 3,753     | 3,750
Server (1000mbit) | 1,963      | 13,445 | 15,410  | 15,408   | 0            | 0        | 296       | 0
Totals            | 31,555     | 13,445 | 45,053  | 45,000   | 31,555       | 13,445   | 45,076    | 45,000

7.3.15 Penultimate Test Results

7.3.16 Final Test Results

7.3.17 Spec Sheet of Proposed Hardware (from module 301: Business Plan)

Specifications

Power
• Built-in power supply
• Built-in UPS

Internals
• Intel P4 processor
• 250GB HDD
• 2GB RAM

Externals
• Power connector
• USB port
• Network port
• HDMI
• Component
• RCA audio
• Optical audio
• Wireless 802.11n

In the box:
• hi-vue
• hi-vue remote
• power cable
• HDMI cable

Size & Weight
• Footprint: 100x300mm
• Height: 300mm
• Weight: 1.5kg

TV Output
• HDTV compatible only
• Minimum: 720p
• Maximum: 2k

Software
• Hi-Vue OS (based on Linux)
• Bootloader (multiple OSs)
• Hi-Vue media browser
• Hi-Vue media player

Formats:
• AVI
• MPEG2
• MPEG4
• H.264
• WMV
• .ts
• BD-DVD
• HD-DVD

7.3.18 Specialised Application (from module 302: Specialised Application)

Introduction
Within this report I will attempt to present my findings with regards to the perceivable quality differences between encoding/compression parameters of H.2641 video.

Hypothesis
The relationship between the various settings of the encoding module of an H.264 video file and the quality perceivable by the user would tend towards a logarithmic curve if placed on a simple file-size/quality graph; hence an optimum point must exist. A more simplified version of the above hypothesis is that as the file-size of a movie file increases, the benefit to a viewer decelerates, and most people are unable to differentiate between marginal drops in quality.

Application
The results from the tests contained in this paper will allow a better understanding of the "optimum" point of file-size vs. quality for viewing movie files. If we can identify the "roll-off" point (where increases in settings/file-size no longer have a major impact on perceived quality) then we can restrict the amount of "wasted" data.

Learning Outcomes
At the end of this project I aim to have a greater understanding of the H.264 file format, more specifically how the parameters of the encoding module affect the final product. For example, by decreasing the FPS (frames per second), and therefore the amount of data contained in the file, will I be able to tell the difference from the original, and if so, how much of a difference?

Activities
The first step in this project is to obtain uncompressed footage as the "control" movie file, as this will be predominantly an experiment-format project. For this footage I decided to shoot a short clip myself, using my own equipment.
I decided it was best to include the following variables in the captured material: fast onset motion, colour changes, and luminance changes. I chose to shoot a time-lapse from my apartment from mid-day until dusk, giving all of the needed factors in an 11-second clip. Once the clip has been transferred to a machine capable of processing raw HD footage, it needs to be converted many times over to produce an array of clips which can be analysed. Because the learning outcomes are very subjective and personal, I must outline what I will be looking for in the clips before I can make judgements.

This is a still-frame capture of frame 1 from the uncompressed control file. I will attempt to identify the overall "perceivable quality" based on the following factors:

Sharpness - using the billboard (blue)
Motion - using the trees on the left and the clouds (yellow)
Colour - using the sunset in the sky and the turquoise building in the foreground (green)
Contrast - the difference between the top left and bottom right of the frame (red)

The selection boxes are an attempt to make the reviewing and analysis of clips more precise, however for an overall score it is also necessary to look at the entire frame/clip to gauge an approximate "perceived quality level", as this, after all, is what the research is concerned with. Please note that due to PDF compression techniques2, there is a possibility that the subtle differences being evaluated in this paper could be lost.

Resources
• Mac with OS X Leopard3
• QuickTime Pro4
• Uncompressed 720p movie file with relevant indicators of perceived quality
• "(The Technology of) Video & Audio Streaming", Austerberry, David, Focal Press, 2005

Activity
1st September - Capture video
2nd - 6th September - Convert video
8th September - Select clips for testing
9th - 12th September - Analyse clips
15th September - Produce statistics/graphs from analysis
17th - 19th September - Write up introduction
22nd - 25th September - Write up reflections
26th - 29th September - Proof-read/references

As the video is in time-lapse format, it took most of the day to capture. The video conversions had to be done manually and could take up to 3 hours each. The selection of clips to be used was based on the resources and personal choice.

Reflection on Personal Learning Outcomes
As I am already familiar with the reasoning for encoding/compression of video, and with the industry leaders, the technical aspect of this project went as smoothly as I had hoped. The ultimate aim of learning about the finer details of the H.264 codec was very daunting, however. From my research online and in books5, I was able to gain a very good understanding of the parameters of video encoding. I had already chosen to ignore the audio track, so my focus was simply on the parameters offered by the encoding dialogue of Apple's H.264 video codec. The picture above shows what the encoder presents at the output stage. It enables us to change the following variables6:

• Frame rate
• Number of key frames
• Data rate
• Encoding quality
• Number of encoding passes
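For anyone wanting to reproduce this kind of parameter sweep outside QuickTime Pro, the same five variables map onto options of the ffmpeg/x264 command-line encoder. The report itself used Apple's encoder dialogue, so this is an equivalent sketch of mine, with illustrative file names and one of the data rates taken from the tables later in this appendix:

import subprocess

# -r sets the frame rate, -g the keyframe interval ("key frame every x
# frames"), -b:v the data rate; running two passes gives the multi-pass
# (VBR) behaviour described below, and -an drops the audio track, which
# this report ignores.
common = ["-i", "control.mov", "-c:v", "libx264",
          "-r", "25", "-g", "250", "-b:v", "4571k", "-an"]
subprocess.run(["ffmpeg", "-y"] + common + ["-pass", "1", "-f", "mp4", "/dev/null"], check=True)
subprocess.run(["ffmpeg", "-y"] + common + ["-pass", "2", "quality2.mp4"], check=True)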
By altering the variables, the preview (a static frame) changes to give a rough representation of the output quality. This was not sufficient for my analysis, as it is not a movie file and only represents one frame of the clip, preventing me from analysing the majority of my test criteria. There are some very complex mathematical calculations involved in the encoder, which I had to learn about before exporting the video required for the test. The equations themselves will not be explained in this paper, however I will attempt to outline the key points I learnt for the benefit of the reader.

If we take the analogy of a pipe with water flowing through it for the video clip, the same analogy used for the general data flow of computer communications, then we can make the following links:

• The diameter of the pipe can be related to the data rate
• The amount of water in any cross-section of the pipe can be related to the quality
• The pressure of the water can be related to the frame-rate

To understand key-frames and encoding passes we must use a different analogy. Key-framing, with respect to video encoding, is much the same as key-framing in a Flash7-based environment: it is a very specific representation of that particular frame. In video encoding, a keyframe is literally a picture of that frame, with non-keyframes holding only the differences between themselves and the previous frame. I have learnt that key-framing is very useful in Flash-based work, as you can specify exactly which frames should be keyframes (change of scene, detailed motion etc.), and using "tweens" in between keyframes is a very efficient way of working. The downside to keyframes with respect to video is that you cannot easily choose which frames need to be keyframes; the options are simply "all", "every x frames" or "automatic". If a keyframe does not fall on the initial frame of a scene change, the potential for creating unwanted data, in the form of unnecessary equations to represent the differences, is greatly increased. For example: a keyframe takes up 100 KB of data, while a simple change within the scene (someone moving their lips while talking in an interview situation) takes up only 10 KB, as we are storing the changes, in this case the lips, and nothing else. When the scene changes without a new keyframe, the entire frame has changed, and trying to represent this in an equation based on the previous frame will undoubtedly take up more than 100 KB of data. It is therefore actually more efficient to insert a new keyframe than to rely on the otherwise highly efficient compression process.

The problem we encounter here is how to choose the keyframes effectively. A method employed by the encoder module is to first run through the entire clip and map out the changes frame by frame on a percentage basis. For example, in the interview situation above, most frames will only change by about 5-10%, however if the scene changes to something completely different, the change is close to 100%. The encoder knows its own efficiency levels; if, for example, the change was over 50%, it would mark that frame to become a keyframe.
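The scene-change mapping just described can be sketched in a few lines: measure the fraction of the frame that changes between consecutive frames and mark a frame as a keyframe candidate when the change crosses a threshold (50% in the example above). Frames are assumed here to be equally-sized greyscale arrays; this illustrates the idea only, and is not QuickTime's actual algorithm.

import numpy as np

def keyframe_candidates(frames, threshold=0.50):
    candidates = [0]  # the first frame is always a keyframe
    for i in range(1, len(frames)):
        diff = np.abs(frames[i].astype(int) - frames[i - 1].astype(int))
        changed = np.mean(diff > 16)  # fraction of pixels that changed noticeably
        if changed > threshold:
            candidates.append(i)
    return candidates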
The number of encoding passes is actually one of the simplest parameters to understand once we cut through the jargon: single-pass encoding attempts to fill out a constant bit-rate (per frame), while multi-pass attempts to fill out to a file-size based on an average bit-rate. This allows the encoder to distribute the data either evenly throughout the file (single pass) or wherever the data is "needed" most (multi pass). The two settings produce a CBR (constant bit-rate) file for single-pass and a VBR (variable bit-rate) file for multi-pass. Based on my learning of all of the above, it is clear that the data-rate is possibly the single most important variable: if it is too high then you are wasting data, if it is too low then you will not be able to fit in the required data. Because of this I decided to leave the data-rate set to automatic (the encoder fills out the data rate based on the other settings), leave the encoding on multi-pass (it is more efficient but takes more time to encode), and leave the key-framing on automatic (as detailed above, the encoder can work out when to insert key-frames). I therefore produced the following files, all created from the control file so there was no chance of re-compression artifacts:

Clip      | Frame Rate (FPS) | Key Frame Every x Frames | Data Rate (kbit/s) | Quality (%) | Encoding   | File-size (MB)
Control   | 25               | All                      | 60,500             | 100         | N/A        | 81.2
Quality 1 | 25               | Auto                     | 17,358             | 100         | Multi-Pass | 22.8
Quality 2 | 25               | Auto                     | 4,571              | 75          | Multi-Pass | 6.0
Quality 3 | 25               | Auto                     | 1,043              | 50          | Multi-Pass | 1.4
Quality 4 | 25               | Auto                     | 625                | 25          | Multi-Pass | 0.8
Quality 5 | 25               | Auto                     | 531                | 1           | Multi-Pass | 0.7
FPS 1     | 24               | Auto                     | 16,666             | 100         | Multi-Pass | 21.9
FPS 2     | 23               | Auto                     | 15,939             | 100         | Multi-Pass | 20.9
FPS 3     | 22               | Auto                     | 15,249             | 100         | Multi-Pass | 20.0
FPS 4     | 21               | Auto                     | 14,628             | 100         | Multi-Pass | 19.2
FPS 5     | 20               | Auto                     | 13,949             | 100         | Multi-Pass | 18.3

Once I had these files, and before reviewing any of them for perceivable quality, it was already apparent from the numbers that the compressor quality vs. file-size relationship is non-linear: a 25% reduction in the quality setting cut the file-size by a factor of almost 4 (22.8 MB to 6.0 MB), affirming my initial hypothesis. By this stage I had learnt the basics of encoding with the H.264 compression method, however I wanted to refine the values by combining more than one variable, in order to hit the "sweet spot" of perceivable quality vs. file-size. To stick to the learning outcomes I had to ensure that this wasn't a trial-and-error approach, so I analysed the above movie clips to produce a numerical form of perceivable quality (see appendices). Once this was complete, a refined set of clips could be made and reviewed in a similar manner. A full breakdown can be seen in the appendices; the resultant movie files chosen were as follows:

Clip      | Frame Rate (FPS) | Key Frame Every x Frames | Data Rate (kbit/s) | Quality (%) | Encoding   | File-size (MB)
Combine 1 | 24               | Auto                     | 16,666             | 100         | Multi-Pass | 21.9
Combine 2 | 24               | Auto                     | 4,395              | 75          | Multi-Pass | 5.8
Combine 3 | 23               | Auto                     | 15,939             | 100         | Multi-Pass | 20.9
Combine 4 | 23               | Auto                     | 4,207              | 75          | Multi-Pass | 5.5
Combine 5 | 22               | Auto                     | 15,249             | 100         | Multi-Pass | 20.0

Based on the findings from the 10 initial clips, I set a threshold of 85% overall perceivable quality and tested the above combinations.
I have learnt that the encoding quality setting affects the file-size far more dramatically than the FPS; however, as I intended to learn about the most efficient settings, the combination of FPS and encoding quality both play a part in the ultimate file-size. Whilst maintaining an "85%" quality level and keeping the file-size as low as possible, I conclude that the "Combine 2" settings are the optimum for this particular clip: a decrease in perceived quality of only 15%, but a decrease in file-size by a factor of 14 (81.2 MB down to 5.8 MB).