Techila with MATLAB
END-USER GUIDE
TECHILA WITH MATLAB
NOVEMBER 08, 2013

Disclaimer

Techila Technologies Ltd. disclaims any and all warranties, express, implied or statutory regarding this document or the use of thereof by you to the full extent permitted by law. Without limiting the generality of the foregoing, this document provided by Techila Technologies Ltd. in connection therewith are provided “as-is” and without warranties of any kind, including, without limitation, any warranties of performance or implied warranties of merchantability, fitness for a particular purpose, title and noninfringement. Further, Techila Technologies Ltd. does not make, and has not made, any presentation or warranty that the document is accurate, complete, reliable, current, error-free, or free from harmful information.

Limitation of Liability

In no event shall Techila Technologies Ltd. or any of its respective directors, officers, employees, or agents, be liable to you or any other person or entity, under any theory, including without limitation negligence, for damages of any kind arising from or related to the application of this document or any information, content, or materials in or accessible through this document, including, but not limited to, direct, indirect, actual, incidental, punitive, special or consequential damages, lost income, revenue or profits, lost or damaged data, or other commercial or economic loss, that result from your use of, or inability to use, this document, even if any of those persons or entities have been advised of the possibility of such damages or such damages are foreseeable.

Use of this document and copyright

No part of this document may be used, reproduced, modified, or transmitted in any form or means without the prior written permission of Techila Technologies. This document and the product it describes are considered protected by copyrights and other intellectual property rights according to the applicable laws. Copyright Techila Technologies 2010-2013.
All rights reserved.

©Copyright 2010-2013, Techila Technologies, Ltd. All rights reserved. Techila, Techila Grid, and the Techila logo are either registered trademarks or trademarks of Techila Technologies Ltd in the European Union, in the United States and/or other countries. All other trademarks are the property of their respective owners.

Table of contents

1 Introduction
    1.1 Preparation
        1.1.1 Modifying the MATLAB search path
        1.1.2 Choosing the compiler for MATLAB applications
    1.2 Example material
    1.3 Naming convention of the MATLAB m-files
    1.4 MATLAB peach-function
    1.5 MATLAB cloudfor-function
2 Cloudfor examples
    2.1 Controlling the number of Jobs
    2.2 Specifying which workspace variables will be transferred
    2.3 Transferring data files
    2.4 Managing return values
3 Peach tutorial examples
    3.1 Executing a MATLAB program on Workers
    3.2 Using input parameters
    3.3 Transferring data files
    3.4 Distributing nested loops
    3.5 Local Testing
4 Peach feature examples
    4.1 Distributing Monte Carlo Pi with the peach-function
    4.2 Snapshot
    4.3 Streaming & Callback Function
    4.4 Job Input Files
    4.5 Precompiled Binaries
    4.6 MEX files
    4.7 Project Detaching
    4.8 Iterative Projects
    4.9 Remote Compiling
5 Troubleshooting
6 Appendix
    6.1 Appendix 1: Examples on Cloudfor control parameter definitions

Author: Techila Technologies. End-User Guide: Techila with MATLAB. Classification: Public. Date: November 08, 2013.

1 Introduction

This document is intended for Techila End-Users who are using MATLAB as their main development environment. If you are unfamiliar with Techila terminology or the operating principles of the Techila technology, information on these can be found in the Techila Fundamentals document.

The structure of this document is as follows:

Chapter 1 contains important information regarding the preparation steps required to use Techila with MATLAB.
This chapter also contains a brief introduction to the naming convention of the MATLAB m-files and introduces the two functions, peach and cloudfor, which are used for distributing computations from MATLAB to the Techila environment.

Chapter 2 contains walkthroughs of code samples that use the cloudfor-function. The example material demonstrates how functions containing locally executable for-loop structures can be converted to distributed versions by using cloudfor-loops. The examples also illustrate how to use control parameters to manage return values and to transfer additional data files to Workers. After examining the material in this Chapter you should be able to convert normal for-loops into cloudfor-loops and manage return values accordingly.

Chapter 3 contains walkthroughs of simple code samples that use the peach-function. The example material illustrates how to control the core features of the peach-function, including defining input arguments and transferring data files with the executable program. Examples are also provided on how to distribute nested loop structures and how to perform local testing on your own computer. After examining the material in this Chapter you should be able to split a simple locally executable program into two pieces of code (Local Control Code and Worker Code), which in turn can be used to perform the computations in the Techila environment.

Chapter 4 contains several examples that illustrate how to implement different features available in the MATLAB peach-function. Each subchapter in this Chapter contains a walkthrough of an executable piece of code that illustrates how to implement one or more peach-function features. Each subchapter is named according to the feature it focuses on. After examining the material in this Chapter you should be able to implement several features available in the MATLAB peach-function in your own application.
Chapter 5 contains a short troubleshooting guide for some of the most frequently encountered error scenarios.

1.1 Preparation

This Chapter contains information on the preparation steps required before you can access the Techila environment from MATLAB. These steps include:

- Modifying the MATLAB search path
- Choosing the compiler for MATLAB applications

1.1.1 Modifying the MATLAB search path

Before proceeding, please add the folder containing the Techila MATLAB toolbox functions to the MATLAB search path. This can be done by following the steps listed below.

1. Change your working directory in MATLAB to the following directory:

       <full path>\techila\lib\Matlab

2. Execute the following command:

       installsdk

This will add the path containing the Techila MATLAB toolbox functions to the MATLAB search path and will attempt to save the search path for use in future MATLAB sessions. The search path is saved to a file called 'pathdef.m'.

Note! If you do not have rights to modify the 'pathdef.m' file, you might receive a warning message. If this occurs, the MATLAB search path might need to be updated at the start of each new MATLAB session.

If you wish to make the path update automatic but do not have rights to modify the 'pathdef.m' file, you can update the MATLAB search path with, for example, the 'userpath' or 'addpath' commands. These commands can be added to the 'startup.m' file in the MATLAB startup folder, which will update the MATLAB search path when launching MATLAB.
Examples of the commands are shown below:

    userpath('<full path>\techila\lib\Matlab')

Or

    addpath('<full path>\techila\lib\Matlab')

The <full path> notation should be replaced with the path leading to the 'techila' folder on your system. More information on how to modify the userpath and startup options can be accessed from MATLAB with the following commands:

    doc userpath
    doc startup
    doc addpath

1.1.2 Choosing the compiler for MATLAB applications

Computational MATLAB Projects require that a local MATLAB compiler is available for compiling the Worker Code. Before proceeding, please ensure that MATLAB is configured to use a suitable compiler. You can set up your MATLAB compiler by executing the following command in MATLAB:

    mbuild -setup

Please note that compiled binaries are platform specific and can only be executed on Workers with a compatible platform according to the table below.

    Compilation Platform and MATLAB Version    Compatible Worker Platforms
    Windows, 32-bit MATLAB                     Windows 32-bit, Windows 64-bit
    Windows, 64-bit MATLAB                     Windows 64-bit
    Linux, 32-bit MATLAB                       Linux 32-bit
    Linux, 64-bit MATLAB                       Linux 64-bit
    Mac OS X, 64-bit MATLAB                    Mac OS X 64-bit

1.2 Example material

The MATLAB m-files containing the example material examined in this document are located under the "cloudfor", "Tutorial" and "Features" folders in the Techila SDK. The "cloudfor" folder contains example material that illustrates the use of the cloudfor-function. The "Tutorial" and "Features" folders contain examples that use the peach-function.
The actual m-files containing the code snippets, which can be used to run the examples, are located in subfolders under these folders. Examples located in the "cloudfor" folder are examined in Chapter 2, examples located in the "Tutorial" folder are examined in Chapter 3 and examples in the "Features" folder are examined in Chapter 4.

[Figure 1: folder tree: techila > Examples > Matlab, containing cloudfor (Chapter 2), Tutorial (Chapter 3) and Features (Chapter 4)]

Figure 1. The example material examined in this document is located under the "Matlab" folder in the Techila SDK.

1.3 Naming convention of the MATLAB m-files

The typical naming convention of MATLAB m-files presented in this document is explained below:

- MATLAB m-files ending with "_dist" contain the Worker Code, which will be compiled and distributed to the Workers when the Local Control Code is executed.
- MATLAB m-files beginning with "run_" contain the Local Control Code, which will create the computational Project when executed locally on the End-User's computer.
- MATLAB m-files beginning with "local_" contain locally executable code, which does not communicate with the Techila environment.

Please note that some m-files and/or functions might be named differently, depending on their role in the computations.

1.4 MATLAB peach-function

The MATLAB peach-function provides a simple interface that can be used to distribute even the most complex programs or precompiled binaries. The general syntax is shown below.

    peach(funcname, params, [files], peachvector, ...)
The peach-function call deploys and executes the function 'funcname' in the Techila environment with the input arguments defined in the 'params' array. The 'files' array is an optional parameter, which can be used to transfer data files to all participating Workers. Even the most minimal peach call will always contain the three mandatory parameters: 'funcname', 'params' and 'peachvector'. The syntax for using these three parameters is shown below:

    peach(funcname, params, peachvector)

As can be seen, the 'files' array does not need to be defined if no additional data files need to be transferred. If the executable function does not require any input arguments, empty curly brackets {} can be used to create an empty 'params' array as shown below:

    peach(funcname, {}, peachvector)

The use of these parameters is illustrated in the Tutorial in Chapter 3. General information on available peach parameters can also be displayed by executing the following command in MATLAB.

    doc peach

1.5 MATLAB cloudfor-function

The cloudfor-function provides an even simpler way to distribute computationally intensive loop structures to the Techila environment. The local for-loop structure that will be distributed and executed on Workers is marked with two keywords: 'cloudfor' and 'cloudend'. These keywords mark the beginning and end of the loop structure that will be executed on the Workers. Figure 2 illustrates the general syntax for marking a block of code for simultaneous execution:

[Figure 2: a regular for-loop running from 'initval' to 'endval' around <executable code>, shown alongside the equivalent loop marked with 'cloudfor' and 'cloudend']

Figure 2. Converting 'for'-loop structures to 'cloudfor'-loop structures enables you to execute the computationally intensive operations in the Techila environment.

The <executable code> notation in the figure above represents the operations that will be executed during each iteration of the loop structure. The 'initval' and 'endval' variables correspond to the same values as in the regular for-loop, representing the start and end values for the loop counter.

Please note that all iterations of a cloudfor-loop will be performed independently on Workers, meaning all computational operations must also be independent. For example, the conversion shown below is possible as all the iterations are independent.

    A=zeros(10,1);
    for i=1:10
        A(i)=i;
    end

Conversion possible:

    A=zeros(10,1);
    cloudfor i=1:10
        A(i)=i;
    cloudend

But it is NOT possible to convert the loop structure shown below. This is because the value of 'A' in the current iteration (e.g. i=3) depends on the value of the previous iteration (i=2).

    A=5;
    for i=1:10
        A=A+A*i;
    end

Conversion NOT possible: the local for-loop contains a recursive dependency, so a cloudfor-loop cannot be used.

When the 'cloudfor' keyword is encountered, all variables currently defined in the MATLAB Workspace will be transferred to Workers. The number of Jobs will, by default, be set automatically by executing the block of code once on the End-User's own computer and using the execution time to determine a suitable number of Jobs.

The number of iterations performed in a single Job can also be controlled with the 'stepsperworker' control parameter as shown below.

    cloudfor var=1:10
    %cf:stepsperworker=2
    <executable code>
    cloudend

In the example above, five (5) Jobs would be created. This is because the maximum value of the loop counter is ten (10) and the 'stepsperworker' parameter defines that two (2) iterations would be executed in each Job.
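The Job count arithmetic described above can be sketched in plain MATLAB. This is only an illustration that runs locally and does not call the Techila toolbox. Rounding up when the iteration count is not evenly divisible, and grouping consecutive iterations into the same Job, are assumptions made for the sketch rather than documented behavior, and the variable names are hypothetical.

```matlab
% Illustrative sketch only; does not use the Techila toolbox.
iterations = 10;        % the loop counter runs over 1:10
stepsperworker = 2;     % iterations per Job, as in the example above

% Assumption: a partial group at the end would form one extra Job.
numJobs = ceil(iterations / stepsperworker);

% Assumption: consecutive iterations are grouped into the same Job.
jobOfIteration = ceil((1:iterations) / stepsperworker);

fprintf('Number of Jobs: %d\n', numJobs);
for job = 1:numJobs
    handled = find(jobOfIteration == job);
    fprintf('Job %d handles iterations %s\n', job, mat2str(handled));
end
```

Changing 'stepsperworker' in the sketch shows how quickly the Job count shrinks as more iterations are packed into each Job.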
It is also possible to use control parameters available for the peach-function by using the '%cf:peach' syntax as illustrated in the example below.

    cloudfor var=1:10
    %cf:stepsperworker=2
    %cf:peach RemoteCompile='true'
    <executable code>
    cloudend

It is also possible to distribute nested for-loop structures. This means that if you have a locally executable nested for-loop structure, you can distribute the computations to the Techila environment by marking the executable code as shown below.

    cloudfor var1=1:10
    %cf:stepsperworker=1
    <executable code>
        cloudfor var2=1:10
        %cf:stepsperworker=1
        <executable code>
        cloudend
    cloudend

In the example above, the value of the 'stepsperworker' parameter is set to one (1) for both cloudfor-loops. This means that the total number of Jobs in the Project would be 100 (10 x 10).

It is also possible to use several cloudfor statements on the same level as shown below.

    cloudfor i=1:10
    <executable code>
    cloudend
    cloudfor j=1:10
    <executable code>
    cloudend

But it is NOT possible to use several cloudfor statements on the same level inside another cloudfor statement. The structure shown below is therefore not allowed.

    cloudfor i=1:10
        cloudfor j=1:10
        <executable code>
        cloudend
        cloudfor k=1:10
        <executable code>
        cloudend
    cloudend

General information on available control parameters can also be displayed by executing the following command in MATLAB.

    doc cloudfor

Please note that cloudfor-loops should only be used to divide the workload in computationally intensive loop structures. If you have a small number of computationally light operations, using cloudfor-loops will not result in better performance. As an exception to this rule, some of the examples discussed in this document will be relatively simple, as they are only intended to illustrate the mechanics of using cloudfor-loops.

2 Cloudfor examples

This Chapter contains walkthroughs of the example material that uses the cloudfor-function. The examples in this Chapter highlight the following subjects:

- Controlling the number of Jobs
- Specifying which workspace variables will be transferred
- Transferring data files
- Managing return values

The example material used in this Chapter, including m-files and data files, can be found in the subfolders under the following folder in the Techila SDK:

    techila\examples\Matlab\cloudfor\<example specific subfolder>

Please note that the example material in this Chapter is only intended to highlight some of the available features in the cloudfor-function. For a complete list of available control parameters, execute the following command in MATLAB:
    doc cloudfor

2.1 Controlling the number of Jobs

This example is intended to illustrate how to distribute the iterations of an extremely simple for-loop structure. Executable code snippets are provided for a locally executable for-loop structure and for a distributed version that uses cloudfor-loops. Information on how to control the number of iterations calculated during a single Job is also provided.

The material used in this example is located in the following folder in the Techila SDK:

    techila\examples\Matlab\cloudfor\1_number_of_jobs

Specifying the number of iterations performed in each Job

When using cloudfor-loops to distribute a for-loop structure, the maximum number of Jobs in the Project will be automatically limited by the number of iterations in the loop. For example, the loop structure below contains 10 iterations, meaning that the Project can contain a maximum of 10 Jobs.

    cloudfor counter=1:10
    <executable code>
    cloudend

When using the cloudfor-function, one iteration of the loop structure will be executed locally on your computer. This execution time will be used to automatically estimate how many iterations should be performed in a single Job. If the execution time required for the locally executed iteration is short, multiple iterations will be automatically grouped in each Job. If the locally executed iteration requires a long time, the number of iterations performed in each Job will be set to one (1).
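To get a feel for how a measured iteration time could translate into a grouping, the sketch below times one iteration locally with tic and toc and derives a number of iterations per Job. The target Job length 'targetJobSeconds' and the rounding rule are hypothetical values chosen purely for illustration; the actual estimate used by cloudfor is not described in this document and may work differently.

```matlab
% Illustrative sketch only; this is NOT the heuristic cloudfor uses internally.
totalIterations = 10;
targetJobSeconds = 30;   % hypothetical desired minimum length of one Job

% Time a single iteration locally (here: the body of a trivial example loop).
counter = 1;
t0 = tic;
value = counter * counter; %#ok<NASGU>
secondsPerIteration = max(toc(t0), eps);   % guard against a zero reading

% Group enough iterations to reach the target, but never fewer than one
% and never more than the loop contains.
iterationsPerJob = ceil(targetJobSeconds / secondsPerIteration);
iterationsPerJob = min(max(iterationsPerJob, 1), totalIterations);

fprintf('Suggested iterations per Job: %d\n', iterationsPerJob);
```

For a loop body this light, the suggestion collapses to a single Job containing every iteration, which matches the guidance that very short iterations should be grouped together.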
If you require more control over the number of iterations that will be performed in each Job, this can be achieved by using the 'stepsperworker' control parameter. The general syntax for using this control parameter is shown below:

    %cf:stepsperworker=<iterations>

where the <iterations> notation defines the number of iterations that should be performed in each Job. For example, the syntax shown below would define that each Job in the Project should perform two iterations:

    %cf:stepsperworker=2

Please note that when using the 'stepsperworker' parameter, you will also fundamentally be defining the length of a single Job. If you only perform a small number of short iterations in each Job, the Jobs might be extremely short, resulting in poor overall efficiency. It is therefore strongly advised to use values that ensure the execution time of a Job will not be too short.

Locally executable program

The locally executable program used in this example is shown below.

    function result=local_loops(loops)
    result=zeros(1,loops);
    for counter=1:loops
        result(counter)=counter*counter;
    end

The code contains a single for-loop, which contains a single multiplication operation where the value of the 'counter' variable is squared. The value of the 'counter' variable will be replaced with the iteration number, which will be different in each iteration. The result of the multiplication will be stored in the result-vector, which will be used to contain all multiplication results.
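Because each iteration writes only to its own element of 'result', the loop has no dependency between iterations. The sketch below repeats the loop body inline (so it does not rely on the 'local_loops.m' file being on the path) and compares the outcome against a vectorized expression. The comparison is an assumption about what makes a reasonable sanity check; it is not part of the Techila example material.

```matlab
% Illustrative sketch only; repeats the loop from local_loops inline.
loops = 10;
result = zeros(1, loops);
for counter = 1:loops
    % Each iteration touches only result(counter), so the order of
    % execution does not affect the final vector.
    result(counter) = counter * counter;
end

% Vectorized reference: the squares of 1..loops.
expected = (1:loops).^2;

if isequal(result, expected)
    disp('Loop result matches the vectorized reference.');
else
    disp('Mismatch between loop result and reference.');
end
```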
The locally executable program can be executed by changing your current working directory in MATLAB to the directory containing the material for this example and executing the command shown below:

    result=local_loops(10)

Executing the command shown above will calculate 10 iterations and store the computed values in the 'result' vector. The values generated during this simple program are illustrated in Figure 3 below.

Figure 3. Results of the latest iteration are stored in the result-vector. The result of the latest iteration has been circled in this figure.

The distributed version of the program

The distributed version of the locally executable program that uses cloudfor-loops is shown below.

    function result=run_loops_dist(loops)
    result=zeros(1,loops);
    cloudfor counter=1:loops
    %cf:stepsperworker=2
        result(counter)=counter*counter;
    cloudend

As can be seen, the locally executable for-loop has been replaced with a cloudfor-loop. The 'for' and 'end' words in the locally executable loop have been replaced with the 'cloudfor' and 'cloudend' keywords. This indicates that the block of code located in the loop structure will be executed on Workers.

The 'stepsperworker' control parameter has been used to define that two iterations should be calculated in each Job. This means that if, for example, the number of loops is set to 10, the number of Jobs will be 5 (number of loops divided by the value of the 'stepsperworker' parameter).

The distributed version can be executed by changing your current working directory in MATLAB to the directory containing the material for this example and executing the command shown below:

    result=run_loops_dist(10)

After you have executed the command, the computational code will be compiled by using the MATLAB compiler. Depending on your system, the compilation will take roughly one minute. After the compilation is completed, the Project will be automatically created and will consist of five (5) Jobs. These Jobs will be computed on Workers and the results will be streamed back to your computer. Results generated during the Jobs will be stored in the 'result' vector at the applicable indices. This is illustrated below in Figure 4.

Figure 4. Each Job will calculate two iterations and will return the values. Please note that even though Jobs are not completed in any given order, cloudfor will automatically store the values of the computation at the correct indices. The results of the latest Jobs have been circled in this figure.

Tip! To view the effect of the 'stepsperworker' control parameter, enter a different value for the parameter. You can also delete the line containing the control parameter entirely.

2.2 Specifying which workspace variables will be transferred

By default, all MATLAB Workspace variables will be transferred to all participating Workers. This can sometimes result in unnecessarily large network transfers, where Workspace variables will be transferred to Workers even though they will not be required in the computation.
After the Workers have finished the computational Jobs, all Workspace variables which have changed will be automatically returned and transferred back to the End-User's computer. This can also result in unnecessary network transfers if only some variables are required.

This example illustrates how to specify which MATLAB Workspace variables should be transferred to Workers and which variables should be returned. This is most beneficial in scenarios where the Workspace contains, for example, large matrices that consume large amounts of memory but are not required on Workers or do not contain meaningful results. Specifying Workspace variables in these kinds of situations will reduce the amount of data that will be transferred and can improve overall performance.

The material used in this example is located in the following folder in the Techila SDK:

    techila\examples\Matlab\cloudfor\2_workspace_variables

Specifying transferable Workspace variables

Workspace variables to be transferred to Workers are specified with the 'inputparam' control parameter. The general syntax for defining Workspace variables is shown below.

    %cf:inputparam=<comma separated values>

where the <comma separated values> notation should be replaced with the names of the Workspace variables that should be transferred to Workers. For example, the following notation specifies that only variables called 'var1' and 'var2' should be transferred.

    %cf:inputparam=var1,var2

Please note that when using the 'inputparam' parameter, you will need to ensure that all necessary Workspace variables are listed. These might also include any vectors and/or matrices that are used to contain the computation results.

Global variables can be transferred by using the 'global' keyword. For example, the following notation specifies that variables 'var1' and 'var2' and all global variables should be transferred to the Workers.
    %cf:inputparam=var1,var2,global

Please note that if you wish to transfer Workspace variables that require more than 10 Megabytes of memory, you will also need to use a control parameter to enable the variables to be transferred. This is a safety measure, which is intended to prevent large, accidental data transfers. To override the 10 Megabyte limit, use the parameter shown below.

    %cf:force:largedata

Workspace variables to be returned from Workers are specified similarly, as a comma separated value list, by using the 'outputparam' control parameter. The general syntax for defining Workspace variables that will be returned is shown below.

    %cf:outputparam=<comma separated values>

For example, the following notation specifies that only the variables called 'out1' and 'out2' should be returned.

    %cf:outputparam=out1,out2

Locally executable program

The locally executable program used in this example is shown below.

    function result = local_workspace
    var1=[2 4 6];
    var2=[11 51 100];
    dummyvar=rand(1000);
    result=zeros(length(var1),length(var2))
    for ii=1:length(var1)
        for jj=1:length(var2)
            result(ii,jj)=var1(ii)*var2(jj);
        end
    end

The program contains a single function, which will multiply elements in the vectors 'var1' and 'var2'. The result of this multiplication will be stored in the 'result' matrix (size of 3x3) at the coordinates indicated by the loop counter indices. The function also defines a variable called 'dummyvar', which is created simply for illustration purposes and is not used in the operations performed in the loop structure.
The purpose of this variable is to demonstrate how using the control parameter to specify transferrable Workspace variables can reduce the amount of network transfer. This is explained in more detail later in this Chapter.

The locally executable program can be executed by changing your current working directory in MATLAB to the directory containing the material for this example and executing the command shown below:

    result=local_workspace

Executing the command shown above will calculate nine iterations (three values of 'ii' times three values of 'jj') and store the results in the 'result' matrix. The values generated during this simple program are illustrated in Figure 5 below.

Figure 5. Result of the latest iteration is stored in the 'result' matrix at the coordinates indicated by the loop counters. The result of the latest iteration has been circled in this figure.

The distributed version of the program

The distributed version of the locally executable program is shown below.

    function result = run_workspace_dist()
    var1=[2 4 6];
    var2=[11 51 100];
    dummyvar=rand(1000);
    result=zeros(length(var1),length(var2))
    cloudfor ii=1:length(var1)
        %cf:stepsperworker=1
        cloudfor jj=1:length(var2)
            %cf:stepsperworker=1
            %cf:inputparam=var1,var2,result
            result(ii,jj)=var1(ii)*var2(jj);
        cloudend
    cloudend

The code contains two nested cloudfor-loops corresponding to the two nested for-loops in the locally executable program.
Each cloudfor-loop also sets the value of the 'stepsperworker' parameter to one (1), meaning the number of Jobs in the computational Project will correspond to the total number of iterations. As each loop contains three iterations, the number of Jobs in the Project will be nine (9).

The Workspace variables that should be transferred are listed on the line containing the 'inputparam' control parameter shown below.

    %cf:inputparam=var1,var2,result

This parameter defines that the variables 'var1', 'var2' and 'result' should be transferred to the Workers. Variables 'var1' and 'var2' are required to perform the multiplication operation and the 'result' variable is required to store the multiplication result. The 'dummyvar' variable is not used in the computation, so there is no need to transfer it and it is not listed.

The distributed version can be executed by changing your current working directory in MATLAB to the directory containing the material for this example and executing the command shown below:

    result = run_workspace_dist

Executing the command shown above will calculate the value of each matrix element in a separate Job. This is illustrated below in Figure 6.

Figure 6. Each Job will calculate the value of one matrix element. Please note that results are received from Workers in the order in which they are completed, meaning the matrix will not be filled in any specific order. Each result will be automatically stored in the correct coordinates.

Tip!
To see how the 'inputparam' parameter reduces the amount of data transferred, remove the line containing the 'inputparam' notation. This will cause the 'dummyvar' variable to be transferred to the Workers. This will raise the amount of data over the 10 Megabyte limit, meaning you will also have to override the 10 Megabyte safety limit discussed earlier in this Chapter with the %cf:force:largedata parameter. After you have modified the code, you should see a much larger upload when creating the computational Project.

2.3 Transferring data files

By default, only the file containing the executable program will be transferred to the Workers. If the executable program requires access to additional data files during the computation, the data files need to be specified in order to transfer them to the Workers.

This example illustrates how to transfer input files to Workers and how to transfer output files from the Workers back to your computer.

The material used in this example is located in the following folder in the Techila SDK:

    techila\examples\Matlab\cloudfor\3_transferring_data_files

Specifying which files should be transferred

Input files can be transferred to all participating Workers by using the 'datafile' control parameter. The general syntax for defining files is shown below.

    %cf:datafile={{<comma separated list of files>}}

For example, the syntax shown below defines that the files 'file1' and 'file2' should be placed in a Data Bundle and transferred to Workers.
    %cf:datafile={{'file1','file2'}}

These files will be automatically copied to the same temporary working directory as the executable program.

Files can also be placed in different Data Bundles by using several 'datafile' parameter entries. This can be beneficial, for example, in situations where some of the files change frequently and some of the files remain unchanged. Bundles are only recreated if the content of the Bundle has changed, so placing frequently changing files in one Data Bundle and static files in another Bundle will reduce the amount of unnecessary data transfers.

The syntax shown below illustrates how files can be placed in different Data Bundles. Files 'file1' and 'file2' are placed in one Data Bundle and files 'file3' and 'file4' are placed in the second Data Bundle.

    %cf:datafile={{'file1','file2'}}
    %cf:datafile={{'file3','file4'}}

Additional output files can be returned from Workers by using the 'peach OutputFiles' parameter. The general syntax for defining the names of output files is shown below.

    %cf:peach OutputFiles={<comma separated list>}

For example, the syntax shown below defines that the files 'outputfile1' and 'outputfile2' should be transferred back from the Workers.

    %cf:peach OutputFiles={'outputfile1', 'outputfile2'}

If the names of the result files are different each iteration, regular expressions can be used to return a larger selection of output files. For example, the syntax shown below would return all output files beginning with 'outputfile' from the Workers.

    %cf:peach OutputFiles={'outputfile.*'}

Tip! If you wish to perform post-processing on the result files continuously as they are streamed to your computer, you can use the '%cf:callback' notation and the 'TECHILA_FOR_RESULT' variable. The variable 'TECHILA_FOR_RESULT' will be replaced with the path and name of each result file after it has been transferred to your computer. For example, the syntax shown below would unzip each result file to the current working directory after it has been downloaded to your computer from the Techila Server.

    %cf:callback unzip(TECHILA_FOR_RESULTFILE)

Locally executable program

The locally executable program used in this example is shown below.

    function result=local_datafile()
    result=cell(1,3);
    loops=length(result);
    for k=1:loops
        load(['input_' num2str(k) '.mat']);
        result1=conv2(x,y,'same');
        result2=filter2(x,y);
        filename=['output_local' num2str(k) '.data'];
        save(filename,'result1','result2');
    end
    for j=1:loops
        result{j}=load(['output_local' num2str(j) '.data'],'-mat');
    end
    end

The program contains a single function, which contains two for-loop structures. During the first loop, data files are loaded from the current working directory. A different data file will be loaded each iteration, and loading a data file will create two Workspace variables, 'x' and 'y'. These variables are used when performing the 'conv2' and 'filter2' operations later in the code. These operations will create the 'result1' and 'result2' variables, which will be stored in an output file whose name includes the iteration number. The output files will be stored in the current working directory. After all iterations in the first loop have been performed, the second loop structure will be performed.
During this loop structure, values stored in the output files are read from the current working directory and stored in the 'result' variable, which will also be returned from the function.

The locally executable program can be executed by changing your current working directory in MATLAB to the directory containing the material for this example and executing the command shown below:

    result=local_datafile

The distributed version of the program

The distributed version of the locally executable program is shown below.

    function result = run_datafile_dist()
    result=cell(1,3);
    loops=length(result);
    cloudfor k=1:loops
        %cf:stream=false
        %cf:stepsperworker=1
        %cf:resultfilevar=filelist
        %cf:donotimport=true
        %cf:peach OutputFiles={'output_dist.*;regex=true'}
        %cf:datafile={{'input_1.mat','input_2.mat'}}
        %cf:datafile={{'input_3.mat'}}
        load(['input_' num2str(k) '.mat']);
        result1=conv2(x,y,'same');
        result2=filter2(x,y);
        filename=['output_dist' num2str(k) '.data'];
        save(filename,'result1','result2');
    cloudend
    for j=1:loops
        unzip(filelist{j});
        result{j}=load(['output_dist' num2str(j) '.data'],'-mat');
    end
    end

The 'cloudfor' and 'cloudend' keywords are used to mark the first loop structure for execution on the Workers. The code sets the value of the 'stepsperworker' parameter to one (1), meaning the number of Jobs in the computational Project will correspond to the total number of iterations. This means that the number of Jobs in the Project will be three (3). Streaming has been disabled with the 'stream=false' parameter.
This ensures that the second for-loop will process the output files in the same order as in the locally executable version. The parameter '%cf:donotimport=true' specifies that result files should not be loaded, but only added to the file list. Result files will be processed in the for-loop that will be executed after the Project has been completed. The parameter 'resultfilevar=filelist' defines that the name and path of each output file returned from the Workers will be stored in the 'filelist' variable. The locations and names stored in this variable will be used during the second for-loop, where each individual output file will be unzipped to the current working directory.

The data files that will be transferred to all participating Workers are listed on the lines containing the 'datafile' control parameter shown below.

    %cf:datafile={{'input_1.mat','input_2.mat'}}
    %cf:datafile={{'input_3.mat'}}

These parameters define that the files 'input_1.mat', 'input_2.mat' and 'input_3.mat' will be transferred to the Workers. There are two 'datafile' parameter entries, meaning the data files will be placed in two separate Data Bundles. Files 'input_1.mat' and 'input_2.mat' will be placed in one Bundle and file 'input_3.mat' will be placed in another Data Bundle. All files will be copied to the same temporary working directory as the executable program on the Workers. This means the data files can be loaded with the same command as in the locally executable version.
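Because a missing entry in the 'datafile' lists would leave a Job without its input, it can be worth checking locally that every file name built inside the loop is covered. The sketch below is plain MATLAB and does not call Techila. The variable names 'bundle_one', 'bundle_two', 'needed', 'listed' and 'missing' are hypothetical, and the two bundle lists are simply copied by hand from the 'datafile' entries above.

```matlab
loops = 3;

% File names the loop body would load, built the same way as in the example.
needed = cell(1, loops);
for k = 1:loops
    needed{k} = ['input_' num2str(k) '.mat'];
end

% File names copied from the two 'datafile' entries (one list per Data Bundle).
bundle_one = {'input_1.mat', 'input_2.mat'};
bundle_two = {'input_3.mat'};
listed = [bundle_one, bundle_two];

% Any name in 'needed' that is absent from 'listed' would not be transferred.
missing = setdiff(needed, listed);
if isempty(missing)
    disp('Every file loaded in the loop is listed in a datafile entry.');
else
    fprintf('Not listed: %s\n', missing{:});
end
```

The check only compares names. It says nothing about whether the files exist on disk or how they are packaged into Bundles.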
The output files that will be returned from the Workers are listed on the line containing the 'peach OutputFiles' parameter shown below.

    %cf:peach OutputFiles={'output_dist.*;regex=true'}

The notation uses regular expressions and will cause all files starting with 'output_dist' to be returned from the Workers. These output files will be stored in ZIP-files, one output file per ZIP-file. Regular expressions are used because the name of the generated output file will be different each iteration.

The second loop structure will unzip the ZIP-files containing the actual output files generated on Workers. The output files will be unzipped to the current working directory, from where they will be loaded, and the variables stored in the files will be placed in the 'result' variable.

The distributed version can be executed by changing your current working directory in MATLAB to the directory containing the material for this example and executing the command shown below:

    result = run_datafile_dist

Executing the command shown above will create a Project consisting of three Jobs. Each Job will calculate one iteration of the loop structure. The return values will be stored in the 'result' cell array at the index indicated by the value of the loop counter 'k' for each Job.

Tip! To see how using different Data Bundles for transferring different data files reduces unnecessary network transfers, modify the content of the 'input_3.mat' file. After modifying the content, execute the 'run_datafile_dist' command as earlier. After executing the command, only one of the Data Bundles (the one containing the 'input_3.mat' file) will be recreated.
2.4 Managing return values

This Chapter introduces different methods that can be used to manage return values received from the computational Jobs.

The material used in this example is located in the following folder in the Techila SDK:

    techila\examples\Matlab\cloudfor\4_managing_result_variables

Default method

By default, the return values from the executable program are joined together by replacing the old values. This is typically the desired behavior, for example, in cases where Jobs will return a result which will be stored in an empty array. In situations where the result values need to be managed differently, cloudfor provides control parameters which can be used to sum, concatenate or replace the results. These are explained later in this Chapter.

The table below contains an example of how to convert a simple locally executable loop structure to a distributed version. Return values are managed by using the default method, meaning return values will be stored in an empty array at the correct indices.

    LOCALLY EXECUTABLE
        res=zeros(1,10)
        for index=1:10
            res(index)=index*10
        end

    DISTRIBUTED VERSION
        res=zeros(1,10)
        cloudfor index=1:10
            res(index)=index*10
        cloudend

Concatenating values

Concatenation means that the return values are appended one after another. Please note that when the streaming transfer method is used (streaming is enabled by default), the results will be concatenated in the order in which they are returned from the Workers. This means that the results might not be in the same order as in the locally executable version.
To ensure that results are returned in the same order as in the locally executable version, disable streaming with the %cf:stream=false parameter.

A return value will be concatenated if it is defined with the %cf:cat=<name of the variable> parameter. The names of several variables can be given as a comma separated list. For example, the control parameter shown below would concatenate the return values 'res1' and 'res2'.

    %cf:cat=res1,res2

The table below contains an example of how to convert a simple locally executable loop structure to a distributed version. The distributed version uses the %cf:cat parameter to concatenate a variable called 'res'. Please also note that the value of the 'stream' control parameter is set to 'false', meaning results are returned in the same order as in the locally executable version.

    LOCALLY EXECUTABLE
        res=[];
        for index=1:10
            res=[res index*10]
        end

    DISTRIBUTED VERSION
        res=[];
        cloudfor index=1:10
            %cf:cat=res
            %cf:stream='false'
            res=[res index*10]
        cloudend

Summing values

Summation means that the return values are summed, instead of being replaced. This can be beneficial, for example, in cases where the result of the latest iteration is carried over to the next iteration.

A return value can be summed by using the %cf:sum=<name of the variable> parameter. Several variables can be given as a comma separated list. For example, the control parameter shown below would sum the return values 'res1' and 'res2'.
    %cf:sum=res1,res2

The table below contains an example of how to convert a simple locally executable loop structure to a distributed version. The distributed version uses the '%cf:sum' parameter to sum a variable called 'res'.

    LOCALLY EXECUTABLE
        res=0;
        for index=1:10
            res=res+index*10
        end

    DISTRIBUTED VERSION
        res=0;
        cloudfor index=1:10
            %cf:sum=res
            %cf:stepsperworker=1
            res=res+index*10
        cloudend

Replacing values

Replacing means that the value of the variable will be continuously replaced with the latest result received from the Workers. This can be useful, for example, in cases where the results of each iteration need to be stored in a different result file.

A return variable can be replaced by using the %cf:replace=<name of the variable> parameter. Several variables can be given as a comma separated list. For example, the control parameter shown below would replace the return values 'res1' and 'res2'.

    %cf:replace=res1,res2

The table below contains an example of how to convert a simple locally executable loop structure to a distributed version. The distributed version uses the '%cf:replace' parameter to replace the variables called 'res' and 'i'. The 'stepsperworker' parameter defines that each Job will perform one iteration. The value stored in the 'res' variable will be saved in a file (file 'out1' for Job #1, file 'out2' for Job #2 and so on) in the current working directory.

    LOCALLY EXECUTABLE
        res=0;
        for i=1:10
            res=i*10
            save(['out' num2str(i)],'res')
        end

    DISTRIBUTED VERSION
        res=0;
        cloudfor i=1:10
            %cf:replace=res,i
            %cf:stepsperworker=1
            res=i*10
            %cf:callback save(['out' num2str(i)],'res')
        cloudend
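The difference between summing, concatenating and replacing can be seen by emulating the merge step locally. The sketch below is plain MATLAB and does not call Techila. It assumes that each Job performs one iteration and returns the value index*10, and the arrival order in 'arrival' is an invented example of results streaming back out of order. How Techila merges results internally is not described in this guide, so this only mirrors the behavior described in the text above.

```matlab
job_values = (1:10) * 10;            % value each single-iteration Job would return
arrival = [2 1 4 3 6 5 8 7 10 9];    % assumed order in which results arrive

res_sum = 0;       % merged as with %cf:sum
res_cat = [];      % merged as with %cf:cat while streaming is enabled
res_replace = 0;   % merged as with %cf:replace

for job = arrival
    res_sum = res_sum + job_values(job);     % order does not affect the total
    res_cat = [res_cat job_values(job)];     % order of arrival is preserved
    res_replace = job_values(job);           % only the latest result is kept
end

% With streaming disabled, results are handled in Job order instead.
res_cat_ordered = job_values(sort(arrival));

disp(res_sum); disp(res_cat); disp(res_cat_ordered); disp(res_replace);
```

The summed total is independent of arrival order, whereas the concatenated vector only matches the locally executable version when results are processed in Job order.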
Locally executable program

The locally executable program used in this example is shown below.

    function [pi_value, primes]=local_managing_results()
    iter_count=4e5; % Set the number of iterations
    pi_counter=0;   % Init to contain the value of the Pi approximation
    primes=[];      % 0x0 matrix for the concatenated results
    for i_index = 1:iter_count
        pi_counter = pi_counter+mcpi;
        if isprime(i_index)
            primes=[primes i_index];
        end
    end
    pi_value=4*pi_counter/iter_count;
    fprintf('\nThe approximate value of Pi is: %g\n\n',pi_value)
    fprintf('Searched the interval from 1 to %s\n',num2str(iter_count))
    fprintf('Last 10 prime numbers found: %s\n',num2str(primes(end-9:end)))
    end

    function res = mcpi()
    res=0;
    dist = sqrt(rand()^2+rand()^2);
    if dist <= 1
        res=1;
    end
    end

The 'local_managing_results' function will begin by initializing values, which will be used to perform the following activities:

- Determine the number of iterations in the loop structure ('iter_count')
- Keep count of the random points generated within the unit circle ('pi_counter')
- Contain a list of prime numbers ('primes')

The program contains a single for-loop structure, which will perform two different operations. These operations are described below:

- Sum and carry over any existing data generated during the random number generation routine to the 'pi_counter' variable. This is used in the Monte Carlo Pi approximation routine.
- Check and concatenate any prime numbers to the 'primes' vector.

The locally executable program can be executed by changing your current working directory in MATLAB to the directory containing the material for this example and executing the command shown below:

    [pi_value, primes]=local_managing_results();

After executing the command shown above, results should be displayed in the MATLAB Command Window containing information on the approximated value of Pi and the last 10 prime numbers found.
The printout should resemble the one shown below.

    The approximate value of Pi is: 3.14099

    Searched the interval from 1 to 400000
    Last 10 prime numbers found: 399887 399899 399911 399913 399937 399941 399953 399979 399983 399989

The distributed version of the program

The distributed version of the locally executable program is shown below.

    function [pi_value, primes]=run_managing_results_dist()
    iter_count=4e5; % Set the number of iterations
    pi_counter=0;   % Init to contain the value of the Pi approximation
    primes=[];      % 0x0 matrix for the concatenated results
    cloudfor i_index = 1:iter_count
        %cf:sum=pi_counter
        %cf:cat=primes
        %cf:stream='false'
        %cf:stepsperworker=1e4
        pi_counter = pi_counter+mcpi;
        if isprime(i_index)
            primes=[primes i_index];
        end
    cloudend
    pi_value=4*pi_counter/iter_count;
    fprintf('\nThe approximate value of Pi is: %g\n\n',pi_value)
    fprintf('Searched the interval from 1 to %s\n',num2str(iter_count))
    fprintf('Last 10 prime numbers found: %s\n',num2str(primes(end-9:end)))
    end

    function res = mcpi()
    res=0;
    dist = sqrt(rand()^2+rand()^2);
    if dist <= 1
        res=1;
    end
    end

The 'for' and 'end' words in the locally executable loop have been replaced with the 'cloudfor' and 'cloudend' keywords. These indicate the block of code that will be executed on Workers. The function 'mcpi' will be automatically included in the compilation as it is called within the cloudfor-loop structure.

The cloudfor-loop also uses control parameters to manage return values. These parameters are explained below.
    %cf:sum=pi_counter

The control parameter above states that the 'pi_counter' variables returned from the Workers should be summed. This is required as the locally executable program uses the same variable to store and carry over results from previous iterations.

    %cf:cat=primes

The control parameter above states that the 'primes' variables returned from the Workers should be concatenated. This is required as the locally executable program also uses concatenation.

    %cf:stream='false'

The control parameter above states that streaming should be disabled. This is required because the concatenated results in the 'primes' vector need to be in the same order as in the locally executable version.

    %cf:stepsperworker=1e4

The control parameter above states that 10,000 iterations should be performed in each Job. This means that the Project will consist of 40 Jobs, as the total number of iterations performed is 400,000.

The cloudfor version can be executed by changing your current working directory in MATLAB to the directory containing the material for this example and executing the command shown below:

    [pi_value, primes]=run_managing_results_dist;

Return values received from the Jobs will be managed differently, depending on which control parameter was used. The 'pi_counter' values will be summed and the 'primes' values will be concatenated. After the return values have been summed and concatenated, the final results will be displayed in the MATLAB Command Window and should resemble the ones received in the locally executable version.
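The Job count quoted above, and the reason concatenation in Job order reproduces the local result, can be checked with a short local emulation. The sketch below is plain MATLAB and does not call Techila. It assumes that the number of Jobs is the iteration count divided by 'stepsperworker', rounded up, and that each Job receives a contiguous block of iterations. The variable names are hypothetical and the guide does not confirm how iterations are assigned to Jobs.

```matlab
iter_count = 4e5;    % total number of iterations, as in the example
steps = 1e4;         % iterations per Job, as set with 'stepsperworker'

% Assumed relation between iterations and Jobs.
num_jobs = ceil(iter_count / steps);

% Emulate each Job searching its own block and concatenate in Job order.
primes_cat = [];
for job = 1:num_jobs
    first = (job - 1) * steps + 1;
    last = min(job * steps, iter_count);
    block = first:last;
    primes_cat = [primes_cat block(isprime(block))];
end

fprintf('Number of Jobs: %d\n', num_jobs);
fprintf('Last 10 prime numbers found: %s\n', num2str(primes_cat(end-9:end)));
```

Under these assumptions the concatenated vector equals the list returned by MATLAB's built-in 'primes' function for the same interval, which is why disabling streaming is enough to match the locally executable version.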
Please note that small deviations in the approximated value of Pi are expected, because the 'mcpi' function uses random numbers and no fixed random number generator seeds are used.

3 Peach tutorial examples

This Chapter contains minimalistic examples of how to implement and control the core features of the peach-function. The example material used in this section, including m-files and data files, can be found under the following folder in the Techila SDK:

    techila\examples\Matlab\Tutorial

Each of the examples contains three pieces of code:

- A locally executable function. The locally executable function can be executed locally and will not communicate with the distributed computing environment in any way.
- A function containing the Local Control Code. This function is executed locally and will be used to distribute the computations in the Worker Code to the distributed computing environment.
- A function containing the Worker Code. This function will be compiled into a binary and executed on the Workers. This function contains the computationally intensive part of the locally executable function.

Please note that the example material in this Chapter is only intended to illustrate the core mechanics related to distributing computation with the peach-function. More information on available features can be found in Chapter 4 and by executing the following command in MATLAB.

    doc peach
3.1 Executing a MATLAB program on Workers

This example is intended to provide an introduction to distributed computing using the peach-function. The purpose of this example is to:

- Demonstrate how to modify a simple, locally executable function to a distributed version, where computational operations will take place on the Workers
- Demonstrate the difference between Local Control Code and Worker Code in the MATLAB environment
- Demonstrate the basic syntax of the peach-function in the MATLAB environment

The material used in this example is located in the following folder in the Techila SDK:

    techila\examples\Matlab\Tutorial\1_distribution

Locally executable function

The locally executable function called 'local_function' contains a simple algorithm that consists of a single for-loop. The algorithm of the locally executable function used in this example is shown below.

    function result = local_function(loops)
    for i = 1:loops
        result(i) = 1+1;
    end

The program requires one input argument called 'loops', which defines the number of iterations that will be performed in the loop structure. Each iteration performs the same arithmetic operation: 1+1. The result of the latest iteration will be appended to the 'result' vector. The result vector for five iterations (loops=5) is shown below.

    index    1  2  3  4  5
    result   2  2  2  2  2

Distributed version of the program

All arithmetic operations in the locally executable function are performed in the for-loop. There are no recursive data dependencies between iterations, meaning that all the iterations can be performed simultaneously.
This is done by placing the computational instructions in the Worker Code in a file called 'distribution_dist.m'. Local Control Code (in the file 'run_distribution.m') will be used to create the computational Project. The Worker Code in the 'distribution_dist.m' file is automatically compiled to a binary, which will be placed in the Executor Bundle and transferred to the Workers where the function 'distribution_dist' will be executed.

Local Control Code

The Local Control Code used to control the distribution process is shown below.

    function result = run_distribution(jobs)
    result = peach('distribution_dist',{},1:jobs);
    result = cell2mat(result);
    end

The function 'run_distribution' requires one input parameter. This input parameter will be used to specify the number of Jobs into which the Project should be split. This is performed by using the value of the 'jobs' parameter to determine the length of the 'peachvector'. Please see an illustration of the parameter dependencies in Figure 7 below.

In this example, the 'params' array is empty, which indicates that no parameters will be transferred to the Worker Code. On the last line, the result vector will be converted to matrix format by using the 'cell2mat' function. This conversion will take place after all Jobs have been completed and the results have been transferred back to the End-User's computer.

Figure 7. The peach-function call in the Local Control Code. The input variable received from the End-User determines the value of the 'jobs' variable.
This variable is used to define the length of the 'peachvector'. The 'params' array is empty, which indicates that no parameters will be transferred to the Worker Code in the 'distribution_dist' function.

Worker Code

The Worker Code that is executed on the Workers is shown below.

    function result = distribution_dist()
    result = 1 + 1;
    end

Operations performed in the Worker Code are equivalent to one iteration of the locally executable loop structure. As no input parameters will be transferred to the Worker Code in this example, identical arithmetic operations are performed during all Jobs. This is illustrated below in Figure 8.

Figure 8. The Worker Code is executed on the Workers without any input parameters as indicated by the empty brackets in the peach-function call.

Creating the Project

The Project can be created by executing the Local Control Code using the command:

    result = run_distribution(5)

This command will create a computational Project that will consist of five Jobs. Because the 'params' array in the peach-function is represented by empty curly brackets, no input parameters will be transferred to the Workers. The Workers will execute the computational operations in the 'distribution_dist' function without any input parameters and return results to the Techila Server. The interaction between the Local Control Code and Worker Code is illustrated below in Figure 9.

Figure 9. The Local Control Code is executed on the End-User's computer. The 'params' array of the peach-function call is empty.
This indicates that no parameters will be transferred to the Worker Code. The Worker Code is executed on the Workers without any input parameters.

When the Local Control Code is executed using the syntax run_distribution(5), the computational operations illustrated in Figure 10 will take place on Workers.

Figure 10. The input parameter to the Local Control Code is used to determine the number of Jobs. The same arithmetic operation, 1+1, is performed in each Job. Results are delivered back to the End-User's computer where they are stored in the 'result' vector.

3.2 Using input parameters

This example will demonstrate:

- How to define input arguments to the program that will be executed on Workers

In this example, input arguments will be transferred to the Workers using the 'params' array, which is the second input argument of the peach-function. Generally speaking, several input parameters can be given to the executable program by defining the parameters as a comma-separated list enclosed in curly brackets. For example, the syntax below defines that variables called 'var1' and 'var2' will be given to the executable program as input arguments.

    peach('funcname',{var1,var2},1:10);

Elements of the 'peachvector' can be given as input arguments by using the '<param>' notation. For example, the syntax shown below defines that the third input argument of the executable program ('funcname') should be an element of the 'peachvector'.
    peach('funcname',{var1,var2,'<param>'},1:10);

The material used in this example is located in the following folder in the Techila SDK:

    techila\examples\Matlab\Tutorial\2_parameters

Locally executable function

The algorithm for the locally executable function used in this example is shown below.

    function result = local_function(multip,loops)
    for i = 1:loops
        result(i) = multip * i;
    end

This function requires two input parameters: 'multip' and 'loops'. The 'loops' parameter determines the number of iterations that are performed, and 'multip' is a number that will be multiplied by the iteration number 'i'. The result of this arithmetic operation will then be appended to a vector called 'result', which will be returned as the output value of the function. The result vector when five iterations are performed is shown below.

    multip = 2; loops=5
    index    1  2  3  4  5
    result   2  4  6  8  10

Distributed version of the program

All the computations in the locally executable function are performed in the for-loop and there are no dependencies between the iterations. Because of this, the operations performed in the for-loop can be performed on the Workers. This can be achieved by extracting the operations into a separate piece of code, which will be compiled into an executable binary and sent to the Workers for execution. Input parameters for the executable binary will be transferred with the 'params' array (second input argument) of the peach-function.

Local Control Code

The Local Control Code used to control the distribution process is shown below.
    function result = run_parameters(multip,jobs)
    result = peach('parameter_dist',{multip,'<param>'},1:jobs);
    result = cell2mat(result);
    end

The 'run_parameters' function takes two input parameters: 'multip' and 'jobs'. The value of the 'multip' parameter will be identical on all Workers. The value of the 'jobs' parameter will be used to determine the length of the 'peachvector', meaning that the parameter also determines the number of Jobs in the Project. Elements of the 'peachvector' are used as a dynamic input parameter as indicated by the '<param>' notation in the 'params' array. The peach-function syntax in the Local Control Code is illustrated below in Figure 11.

Figure 11. The peach-function syntax in the Local Control Code. Parameters listed inside the curly brackets will be transferred to the Worker Code. The '<param>' notation is used to give elements of the peachvector to the Worker Code as input arguments. The value of this parameter will be different in each Job. The value of the jobs variable is defined by the End-User and it is used to define the length of the peachvector. The value of the jobs parameter therefore defines the number of Jobs.

Worker Code

The Worker Code that is executed on the Workers is shown below.

    function result = parameter_dist(multip,jobidx)
    result = multip * jobidx;
    end

The Local Control Code discussed earlier in this Chapter defined two parameters in the 'params' array. One of these parameters was a static parameter and the other was a dynamic parameter. In the Worker Code, the static parameter is represented by a parameter called 'multip', which is constant across all Jobs. The dynamic parameter is represented by the 'jobidx' parameter, which will get replaced by a different element of the 'peachvector' in each Job. As a result, the 'jobidx' parameter plays the role of the iteration number of the locally executable function.
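The relationship between the static parameter, the dynamic parameter and the 'peachvector' can be checked without a Techila environment. The following is a minimal local sketch, not part of the SDK material: the anonymous function 'parameter_dist_local' is a hypothetical stand-in for the Worker Code, and the loop only imitates one call per 'peachvector' element. It assumes that results are collected in a cell array, as the 'cell2mat' call in the Local Control Code suggests.

```matlab
% Hypothetical local stand-in for the Worker Code 'parameter_dist'.
parameter_dist_local = @(multip, jobidx) multip * jobidx;

multip = 2;            % static parameter, identical in every Job
peachvector = 1:5;     % one element per Job (dynamic parameter)

% Imitate the Project locally: one call per peachvector element,
% results collected in a cell array (assumed result format).
result = cell(1, numel(peachvector));
for k = 1:numel(peachvector)
    result{k} = parameter_dist_local(multip, peachvector(k));
end

% Same post-processing step as in the Local Control Code.
result = cell2mat(result);
disp(result)
```

With 'multip' set to 2 and five elements in the 'peachvector', the sketch reproduces the result vector of the locally executable function shown earlier (2, 4, 6, 8, 10).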
Parameters transferred to the Worker Code are illustrated below in Figure 12.

Figure 12. The Worker Code requires two input parameters. These parameters are defined in the params array of the peach-function call in the Local Control Code.

Creating the Project

The Project will be created by executing the Local Control Code using the command:

    result = run_parameters(2,5)

This will create a computational Project that will consist of five Jobs. The parameters in the 'params' array in the peach-function call will be given values based on the input arguments of the 'run_parameters' function. The Workers execute the 'parameter_dist' function using one static and one dynamic input parameter as illustrated in Figure 13 below.

Figure 13. Parameters defined in the params array of the peach-function call in the Local Control Code. These parameters are transferred to the Worker Code executed on the Workers. The multip parameter is constant for all Jobs. The value of the jobidx parameter will be replaced by elements of the peachvector.

As mentioned above, the Local Control Code can be executed with the following syntax:

    result = run_parameters(2,5)

This will set the value of the static parameter 'multip' to two (2). The peachvector will contain the integers from one to five. These integers are used to define the value of the 'jobidx' parameter in the Worker Code. The computational operations resulting from the syntax shown above are illustrated below in Figure 14.
Figure 14. Executing the Local Control Code with the syntax shown in the figure will create a computational Project that consists of five Jobs. The value of the multip parameter is constant, remaining the same for all Jobs. The jobidx parameter is replaced with elements of the peachvector, receiving a different element for each Job. Job results are stored in the result vector, which is returned as the output value of the Local Control Code.

3.3 Transferring data files

This example will demonstrate:

- How to transfer data files to the Workers

In this example, files will be transferred to the Workers using the 'files' array of the peach-function. The data file used in this example is a MAT file called 'datafile.mat'. This file is located in the same directory as the m-files containing the Local Control Code and Worker Code that are used in this example.

Generally speaking, data files can be transferred to all Workers by defining the file names as a comma-separated list. For example, the syntax shown below defines that files called 'file1' and 'file2' will be placed in a Data Bundle transferred to all participating Workers.
    peach('funcname',{},{'file1','file2'},1:10);

Sub-arrays can be used to place several files in different Data Bundles. For example, the syntax shown below would transfer four files ('file1', 'file2', 'file3' and 'file4') to all participating Workers. The first two files would be placed in one Data Bundle and the last two files in a second Data Bundle.

    peach('funcname',{}, {{'file1', 'file2'},{'file3', 'file4'}},1:10);

The material used in this example is located in the following folder in the Techila SDK:

    techila\examples\Matlab\Tutorial\3_datafiles

Locally executable function

The algorithm of the locally executable function used in this example is shown below.

    function result = local_function()
    load datafile.mat
    for i=1:10
        result(i)=mean(datafile(:,i));
    end

The locally executable function 'local_function' consists of two logical parts: loading input data into memory and performing analysis on this data. The input data is stored in a MAT-file called 'datafile.mat', which contains a 10x10 matrix of integers. The computational part consists of calculating the mean value of each column in the 'datafile' matrix. Each iteration of the loop structure calculates the mean value of a single column vector, meaning that the number of iterations is fixed to ten (10).

Distributed version of the program

The distributed version consists of the same logical parts as the locally executable function: loading input data and performing some simple analysis on the data. This data is stored in the 'datafile.mat' file, which will be transferred to Workers with the 'files' array, which is the third input argument of the peach-function.

Local Control Code

The Local Control Code that is used to control the distribution process is shown below.
    function result = run_data_files()
    jobs=10;
    result=peach('data_dist',{'<param>'},{'datafile.mat'},1:jobs);
    result = cell2mat(result);
    end

The Local Control Code 'run_data_files' requires no input parameters; the number of Jobs is defined in the control code by the variable 'jobs'. This is done by using the variable to define the length of the 'peachvector'. Elements of the 'peachvector' are also used as a dynamic input parameter in the 'params' array as indicated by the '<param>' notation. The 'files' array contains the string 'datafile.mat'. This is the name of the file that will be transferred to all Workers. The peach-function syntax used in the Local Control Code is illustrated in Figure 15 below.

Figure 15. The Project is created by the peach-function call in the Local Control Code. The datafile.mat file is transferred to all Workers participating in the Project. The Worker Code will be provided one dynamic input parameter, which will be replaced with a different element of the peachvector for each Job.

Worker Code

The Worker Code that is executed on the Workers is shown below.

    function result=data_dist(jobidx)
    load('datafile.mat')
    result=mean(datafile(:,jobidx));
    end

The Local Control Code introduced earlier in this Chapter defines one dynamic input parameter. This is represented in the Worker Code by the 'jobidx' parameter, which will get replaced by a different element of the 'peachvector' in each Job. This means that 'jobidx' can be used to point to the correct column vector during each Job. The Local Control Code also introduced one (1) filename in the 'files' array.
This file will be copied to the same temporary working directory on the Worker as the executable binary, which means that the file 'datafile.mat' can be loaded into memory using the same syntax as in a locally executable function.

Figure 16. The Worker Code. All the computational data required is copied to a temporary directory on the Worker. Data files can be manipulated with the same instructions as in the locally executable function introduced earlier in this chapter.

Creating the computational project

The Project can be created by executing the Local Control Code using the command:

    result = run_data_files()

The number of Jobs in the Project will be automatically fixed to ten, as the value of the jobs parameter is defined in the Local Control Code. Parameters in the 'params' array and the file specified in the 'files' array will be transferred to the Workers. During each Job, the executable binary will be executed with one input argument, which will be a different element of the 'peachvector' for each Job. The 'datafile.mat' file can be accessed in a similar manner as in the locally executable version, because the file has been copied to the same temporary working directory as the executable binary. The interaction between the Local Control Code and Worker Code is illustrated in Figure 17 below.

Figure 17. Parameters in the params array will be transferred to the Worker Code. The datafile.mat file will be transferred to the same temporary directory as the Worker Code.
The syntax for loading the datafile.mat will be the same as in the locally executable function.

3.4 Distributing nested loops

Locally executable programs can often contain nested for-loops, where the computationally intensive part is located in the innermost for-loop. In such scenarios, the peachvector can be used as a pointer to simulate the values of the loop counters of the inner and outer loops. An example of a nested loop structure is shown below in pseudo-code:

    for i = 1:x              % Outer Loop
        for j = 1:y          % Inner Loop
            operation(i,j)   % Computation
        end
    end

The nested loop structure in the example shown above contains two for-loops, an outer and an inner loop. The computationally intensive operations are located inside the inner for-loop and the values of the loop counters are used as input parameters in the computationally intensive operations. The combinations of the inner and outer loop counters are illustrated below in Figure 18.

Figure 18. Sample combinations of the loop counters in a nested loop structure.

As can be seen from Figure 18, the values of the inner and outer loop counters can also be thought to correspond to the row and column subscripts, 'i' and 'j', of a 2-D matrix. The linear equivalent of these subscripts can be generated by using MATLAB's 'sub2ind' function. This converts each subscript pair into a single value, meaning that an element of the peachvector can be used to refer to the correct pair. The linear indexes can also be converted back to subscript format with the 'ind2sub' command.

Figure 19.
Subscripts can be converted to linear indexes with the sub2ind command. These values can be thought to correspond to a jobidx value in a distributed version of the program. These values can be converted back to subscript format with the ind2sub command.

The following example shows how to distribute a locally executable function that has a nested loop structure.

The material used in this example is located in the following folder in the Techila SDK:

    techila\examples\Matlab\Tutorial\4_nestedloops

Locally executable function

The algorithm of the locally executable function used in this example is shown below.

    function result = local_function()
    A = [2 4 6; 8 10 16; 14 16 18]; % Create matrix A
    Imax=size(A,1); % Number of rows
    Jmax=size(A,2); % Number of columns
    for i = 1:Imax
        for j = 1:Jmax
            result(i,j) = A(i,j)*2; % Multiply each element individually
        end
    end

The function requires no input parameters. Each element of the matrix A is multiplied separately inside a nested loop structure. The result of the operation will be stored in the result-matrix at the subscript indices determined by the values of the loop counters.

Local Control Code

The Local Control Code used to distribute computations that occur within a nested loop in this example is shown below.
    function result = run_nested_loop()
    A = [2 4 6; 8 10 16; 14 16 18]; % Create matrix A
    siz=size(A); % Get the size of the matrix
    jobs=siz(1)*siz(2); % Set number of jobs equal to the number of elements
    result=peach('nested_dist',{'<param>',siz,A}, 1:jobs); % Create the project
    result=reshape(cell2mat(result),siz); % Reshape to a 2-D matrix
    end

The 'params' array contains three parameters: one of the parameters is dynamic and two of the parameters are static. Parameter 'siz' contains the size of the matrix and will be used to convert 'jobidx' to subscript format on the Workers.

Worker Code

The Worker Code that is executed on the Workers is shown below.

    function result = nested_dist(jobidx,siz,A)
    % Convert the jobidx value to row and column subscripts
    [i,j] = ind2sub(siz, jobidx);
    % Multiply the matrix element. This result is returned to the server.
    result=2 * A(i,j);
    end

Each Job will operate on one element of the two-dimensional matrix. The value of the 'jobidx' parameter will determine which matrix element will be operated on. This will be done by converting the 'jobidx' parameter to subscript format and using it to point to the correct matrix element.

Creating the computational project

The Project can be created by executing the Local Control Code using the command:

    result = run_nested_loop()

The Project will automatically be divided into nine Jobs, as the value of the 'jobs' parameter was defined in the Local Control Code. Parameters defined in the 'params' array will be transferred to the Workers.
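The index conversion used in this example can be verified locally before distributing anything. The sketch below is an illustration only and does not call the peach-function: an ordinary loop over the linear indexes stands in for the Jobs, and the per-element operation mirrors the Worker Code. It relies on MATLAB's column-major ordering, which is why reshaping the collected values with 'siz' restores the original matrix layout.

```matlab
A = [2 4 6; 8 10 16; 14 16 18];   % same matrix as in the example
siz = size(A);
jobs = prod(siz);                 % one linear index per matrix element

result = zeros(1, jobs);
for jobidx = 1:jobs               % local stand-in for the Jobs
    [i, j] = ind2sub(siz, jobidx);           % linear index -> subscripts
    assert(sub2ind(siz, i, j) == jobidx);    % round trip gives the same index
    result(jobidx) = 2 * A(i, j);            % same operation as the Worker Code
end

result = reshape(result, siz);    % back to a 2-D matrix
disp(result)
```

The reshaped matrix equals 2*A, which is also what the locally executable function returns.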
Elements of the peachvector (received in the 'jobidx' parameter) will be converted to subscript format on the Workers by using the built-in 'ind2sub' function.

3.5 Local Testing

It is possible to test Worker Code locally. This means that it is not necessary to create a separate Project for testing purposes. Local testing can be done as follows.

1. Set a breakpoint in the Local Control Code on the line containing the peach-function call.
2. Run the Local Control Code as you would when distributing computations to the distributed computing environment. The Local Control Code execution will break at the breakpoint and MATLAB will automatically enter debug mode. The MATLAB workspace will now contain the same variables that are available when the peach-function call is performed.
3. If you are using peachvector elements as input arguments ('<param>' notation), give the corresponding input argument a value that matches an element of the 'peachvector'.
4. Execute the Worker Code locally by calling the function using the desired parameters. Ensure that the code functions properly and produces the expected results.

This procedure can be tested, for example, using the code illustrated in Chapter 3.4:

1. Open the file 'run_nested_loop.m' with the MATLAB Editor and set a breakpoint on the line shown below:

    result=peach('nested_dist',{'<param>',siz,A}, 1:jobs);

2. Execute the following command in MATLAB:

    result = run_nested_loop

   Execution will break at the line containing the breakpoint.
The workspace will contain the following variables:

    A  jobs  siz

3. Give a value for the 'jobidx' parameter using the command:

    K>> jobidx=1

4. Execute the Worker Code using the command:

    K>> result=nested_dist(jobidx,siz,A)

Assuming 'jobidx' was set to 1, this will produce the following result:

    result =
         4

This corresponds to the computational operations that would take place on the Worker that computes the first Job in the Project.

4 Peach feature examples

The basic methodology and syntax of distributing computations using the MATLAB peach-function was shown in the Tutorial in Chapter 3. In addition to the core features used in the Tutorial, there is a wide range of optional features, such as Snapshotting, Streaming and Job Input Files. This Chapter contains examples on how to implement these and several other features.

The example material discussed in this Chapter can be found in example-specific subfolders in the following Techila SDK folder:

    techila\examples\Matlab\Features\<example specific subfolder>

Please note that the example material discussed in this Chapter does not contain examples on all available peach-features. For a complete list of available features, execute the following command in MATLAB:

    doc peach

Monte Carlo Method

A Monte Carlo method is used in several of the examples for evaluating the value of Pi. This section contains a short introduction on the Monte Carlo method used in these examples.

The Monte Carlo method is a statistical simulation where random numbers are used to model and solve a computational problem.
This method can also be used to approximate the value of Pi with the help of a unit circle and a random number generator.

Figure 20. A unit circle has an area equal to $\pi$. The square surrounding a unit circle has an area equal to 4.

The area of the unit circle shown in Figure 20 is determined by the equation $A_{\mathrm{circle}} = \pi r^2$ and the area of the square surrounding it by the equation $A_{\mathrm{square}} = (2r)^2$. This means the ratio of areas is defined as follows:

$$\frac{A_{\mathrm{circle}}}{A_{\mathrm{square}}} = \frac{\pi r^2}{(2r)^2} = \frac{\pi}{4}$$

When a random point is generated, it can be located within or outside the unit circle. When a large number of points are generated with a reliable random number generator, they will be spread evenly over the square. As more and more points are generated, the ratio of points within the circle to the total number of points starts to approximate the ratio of the two areas.

For example, in a simulation of 1000 random points, the typical number of points within the circle is approximately 785. This means that the value of Pi is calculated in the following way:

$$\pi \approx 4 \cdot \frac{785}{1000} = 3.14$$

Algorithmic implementations usually use only one quarter of a circle with a radius of 1. This is simply because random number generators on many platforms produce numbers with a uniform(0,1) distribution. This does not change the approximation procedure, because the ratios of the areas remain the same.
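The quarter-circle approach described above can be sketched in a few lines of MATLAB. This is a vectorized illustration rather than the code used in the examples; the sample size 'N' and the variable names are assumptions chosen for the sketch. Points are drawn uniformly from the unit square, and the fraction falling inside the quarter circle approximates Pi/4.

```matlab
N = 100000;                      % number of random points (assumed value)

x = rand(N, 1);                  % uniform(0,1) coordinates
y = rand(N, 1);

% A point lies inside the quarter circle when its squared distance
% from the origin is below 1.
inside = sum(x.^2 + y.^2 < 1);

piEstimate = 4 * inside / N;     % scale the area ratio Pi/4 back to Pi
fprintf('Estimated value of Pi: %.4f\n', piEstimate);
```

Because no fixed seed is set, the printed value varies slightly between runs.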
4.1 Distributing Monte Carlo Pi with the peach-function

This example will demonstrate:

- Approximation of the value of Pi using the Monte Carlo method
- Conversion of a locally executable Monte Carlo method to a distributed version

The material used in this example is located in the following folder in the Techila SDK:

    techila\examples\Matlab\Features\basic_monte_carlo_pi

Locally executable function

The locally executable function for approximating the value of Pi used in this example is shown below.

    function result = local_mcpi(loops)
    count = 0;
    for i = 1:loops
        if ((sqrt(rand() ^ 2 + rand() ^ 2)) < 1)
            count = count + 1;
        end
    end
    result = 4 * count/loops;
    end

The function requires one input argument that determines the number of iterations in the for-loop. During each iteration, two random values will be generated. These will be used as the coordinates of the random point. The coordinates of the point are then used to calculate the distance of the point from the center of the unit circle. If the distance is less than one, the point is located within the unit circle and the counter is incremented by one. As soon as all iterations have been completed, the value of Pi will be calculated.

The locally executable version of Monte Carlo Pi can be executed using the command:

    pivalue = local_mcpi(10000000)

This will calculate the approximated value of Pi using 10,000,000 randomly generated points.

Distributed version of the program

The computationally intensive part in Monte Carlo methods is the random number sampling. The sampling is performed in the for-loop in the locally executable function.
There are no dependencies between the iterations. This means that the sampling process can be moved into a separate function and executed simultaneously on several Workers.

Note that when using the peach-function, the seed of the random number generator is initialized automatically on the Workers by the peachclient-wrapper. If you wish to use a different seeding method, seed the random number generator directly in the code that is executed on the Workers.

Local Control Code

The Local Control Code used in this example to control the distribution process is shown below.

    function result = run_mcpi(jobs, loops)
    result = peach('mcpi_dist', {loops}, 1:jobs);
    result = sum(cell2mat(result))* 4 / (loops * jobs);
    end

The Local Control Code contains two lines with distinctly different roles. The line containing the peach-function call creates the computational Project in the Techila environment. The last line is executed after the Project has been completed and contains the post-processing instructions.

The peach-function call defines that the function 'mcpi_dist' will be deployed and executed on Workers. The 'mcpi_dist' function receives one input argument, 'loops', which is defined in the 'params' array. The 'peachvector' is only used to control the number of Jobs in the Project.

Worker Code

The code that is executed on Workers in this example is shown below.

    function result = mcpi_dist(loops)
    result = 0; %No random points generated yet, init to 0.
    for i = 1:loops %Monte Carlo loop from 1 to loops
        if ((sqrt(rand() ^ 2 + rand() ^ 2)) < 1) % Point within the circle?
            result = result + 1; % Increment if the point is within the circle.
        end
    end
    end

The algorithm is very similar to the algorithm of the locally executable function. The function requires one input argument, 'loops', which determines the number of iterations. During each iteration, the distance of a randomly generated point from the centre is calculated. If the distance is less than one, the point is within the unit circle and the count is incremented by one. The only difference is that the post-processing, which would otherwise take place after the for-loop, is not performed in the Worker Code.

Creating the computational project

The computational Project can be created with the command shown below:

    result = run_mcpi(10, 1000000)

This creates a Project consisting of ten Jobs, where each Job performs 1,000,000 iterations. The Jobs are distributed to Workers, where the Monte Carlo routine in the Worker Code is executed.

When a Worker finishes the Monte Carlo routine, it sends the results to the Techila Server. After all the Workers have transferred their results to the Techila Server, the results are transferred to the End-User's computer. After the results have been downloaded, the last line in the Local Control Code is executed. This line contains the post-processing operations, which in this case consist of scaling the results according to the number of performed iterations.
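The post-processing step can be tried without a Techila environment by emulating the Jobs in a local loop. The sketch below is a stand-in under stated assumptions: it does not call peach, the cell array 'jobResults' only imitates one returned count per Job, and the inner sampling is written in vectorised form instead of the for-loop used in the Worker Code.

```matlab
% Local stand-in for the distributed run; no Techila functions are called.
jobs = 10;                      % number of emulated Jobs
loops = 100000;                 % iterations per emulated Job (assumed value)
jobResults = cell(1, jobs);     % imitates one result per Job
for j = 1:jobs
    % Vectorised equivalent of the Monte Carlo loop: count the points
    % whose squared distance from the origin is less than 1.
    jobResults{j} = sum(sum(rand(loops, 2).^2, 2) < 1);
end
% Same scaling as the last line of run_mcpi
piEstimate = sum(cell2mat(jobResults)) * 4 / (loops * jobs);
```

Because each emulated Job returns only a count, the final estimate depends on the total number of points and not on how the points were divided between Jobs.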
4.2 Snapshot

Snapshotting is a mechanism where intermediate results of computations are stored in snapshot files and transferred to the Techila Server at regular intervals. Snapshotting is used to improve the fault tolerance of computations and to reduce the amount of computational time lost due to interruptions.

Snapshotting is done by storing the state of the computation at regular intervals in snapshot files on the Worker. The snapshot files are then transferred from the Workers to the Techila Server at regular intervals. If an interruption occurs, these snapshot files are transferred to other available Workers, where the computation can be resumed from the intermediate results stored in the snapshot file.

The material used in this example is located in the following folder in the Techila SDK:

techila\examples\Matlab\Features\snapshot

Snapshotting in MATLAB

Snapshotting is enabled by default when using the peach-function. This means that the Local Control Code does not require any changes if you use the default values set for the snapshot routine:

- The default name for the snapshot file that is transferred to the Techila Server is 'snapshot.mat'. Using this default name enables you to use the 'saveSnapshot' and 'loadSnapshot' helper functions without any additional parameters. If you choose to name your snapshot file differently, see the 'saveSnapshot' and 'loadSnapshot' help for defining the file name.
  Help can be displayed with the commands:

    help loadSnapshot
    help saveSnapshot

- The default snapshot file transfer interval from Workers to the Techila Server is 15 minutes.

If you wish to use custom file names for the snapshot files, you can specify the names using the following parameter pair in the peach-function call:

    'SnapshotFiles', 'filename1, filename2'

If you wish to use a custom transfer interval, you can specify the interval using the following parameter pair:

    'SnapshotInterval', <minutes>

When using the default name of the snapshot file, workspace variables can be stored using the command:

    saveSnapshot('var1','var2',...)

The 'saveSnapshot' function also has a built-in control mechanism that limits how often intermediate results are stored in the snapshot file. This means that snapshots are automatically generated at a suitable frequency. If more control over the snapshot generation frequency is required, the 'saveSnapshot' call can be placed in an 'if' clause:

    if <condition>
        saveSnapshot('var1','var2',...)
    end

When using the default name for the snapshot file, intermediate results stored in a snapshot file can be loaded using the command:

    loadSnapshot()

The implementation of the Snapshot feature is demonstrated using the Monte Carlo Pi example.

Local Control Code

The Local Control Code to create a Project using Snapshotting with default values is shown below.
    function result = run_snapshot(jobs, loops)
    result = peach('snapshot_dist', {loops}, 1:jobs);
    result = sum(cell2mat(result))* 4 / (loops * jobs);
    end

When using the default values for Snapshotting, the Local Control Code does not require any changes. The two parameters controlling the Snapshotting process, the name of the snapshot file and the upload interval, are set to their default values inside the peach-function.

Worker Code

As the Local Control Code did not specify any custom settings for Snapshotting, the procedure uses default values. The upload procedure is automatic, but the Worker Code needs to be modified so that the program can initialize properly when resuming computations from a snapshot file. The Worker Code also needs to be modified to generate snapshot files at a frequency that suits the code. Modified Worker Code using Snapshotting is shown below.

    function result = snapshot_dist(loops)
    result = 0;     %Init: No random points generated yet, init to 0.
    iter=1;         %Init: No iterations have been performed yet, init to 1.
    loadSnapshot;   %Override Init values if snapshot exists
    for iter = iter : loops;    %Monte Carlo loop from iter to loops
        if ((sqrt(rand() ^ 2 + rand() ^ 2)) < 1)
            result = result + 1;
        end
        if mod(iter,1e8)==0     %Snapshot every 1e8 iterations
            saveSnapshot('iter','result')   % Save intermediate results
        end
    end

Initialization of the Worker Code sets values for two variables, 'result' and 'iter'. These initial values are used if no snapshot file can be found. If a snapshot file exists, it indicates that the Job is being resumed after an interruption. In this case, the content of the snapshot file is used to override the initial values. This is done using the 'loadSnapshot' function,
which automatically loads the contents of the snapshot file to the workspace. Iterations are resumed from the last value stored in the snapshot file.

Intermediate results are stored in the snapshot file by calling the 'saveSnapshot' function every 1e8th iteration. The variables stored in the snapshot file are 'iter' and 'result'. 'iter' contains the number of iterations performed when the snapshot was generated and 'result' contains the intermediate result.

Creating the computational project

The Project can be created by executing the Local Control Code using the command:

    result = run_snapshot(10, 1e9)

This creates a Project consisting of 10 Jobs, where each Job consists of 1e9 iterations. Intermediate results are saved at every 1e8th iteration. Snapshot files are transferred every 15 minutes from the Worker to the Techila Server. If a Job is migrated to a new Worker while the Job is being computed, the latest available snapshot file is automatically transferred from the Techila Server to the new Worker.

Snapshot data can be viewed and downloaded by using the Techila Web Interface. Instructions for this can be found in the Techila Web Interface End-User Guide.

Note that when using the syntax shown above to run the example, the execution time of a single Job is relatively short. This might result in the Job being completed before a snapshot file is transferred to the Techila Server. If snapshot data is not visible in the Techila Web Interface, consider increasing the number of iterations to increase the execution time of a Job.
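The resume logic can be tried locally with MATLAB's built-in save and load functions as stand-ins for the Techila helper functions. The sketch below makes several assumptions: the snapshot file name is a hypothetical temporary file, the iteration counts are kept small, and an interruption is emulated with a break statement. It does not reproduce how snapshot files are transferred between Workers and the Techila Server.

```matlab
% Local stand-in for the snapshot pattern; save/load replace the Techila
% helper functions and the file name is hypothetical.
snapFile = [tempname() '.mat'];
loops = 2e5;                 % total iterations (assumed, kept small)
interval = 5e4;              % store intermediate results this often
stopAt = 1.2e5;              % emulate an interruption at this iteration

for attempt = 1:2            % attempt 1 is interrupted, attempt 2 resumes
    result = 0;              % initial values used when no snapshot exists
    next = 1;
    if exist(snapFile, 'file')
        load(snapFile, 'result', 'next');   % override the initial values
    end
    for iter = next:loops
        if (rand()^2 + rand()^2) < 1
            result = result + 1;
        end
        if mod(iter, interval) == 0
            next = iter + 1;                % first iteration not yet counted
            save(snapFile, 'result', 'next');
        end
        if attempt == 1 && iter == stopAt
            break;                          % emulated interruption
        end
    end
end
delete(snapFile);
piEstimate = 4 * result / loops;
```

In this sketch the stored index is that of the next iteration and not of the last completed one, so no iteration is counted twice after a resume. The work done between the last stored snapshot and the interruption is repeated, which is the cost that a shorter snapshot interval reduces.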
4.3 Streaming & Callback Function

Streaming enables individual results to be transferred as soon as they become available. This is different from the default implementation, where all the results are transferred in a single package after all of the Jobs have been completed.

The Callback function enables results to be handled as soon as they have been streamed from the Techila Server to the End-User. The Callback function is called once for each result file that is transferred from the Techila Server. The example presented in this Chapter uses both Streaming and a Callback function.

The material used in this example is located in the following folder in the Techila SDK:

techila\examples\Matlab\Features\streaming_callback

Streaming in MATLAB

Streaming is disabled by default. Streaming can be enabled using the following parameter pair:

    'StreamResults', 'true'

Callback Function in MATLAB

A function can be used as a Callback function by defining the name of the function using the following parameter pair:

    'CallbackMethod', @name_of_the_function

The function is then called every time a new result file has been streamed from the Techila Server to the End-User. By default, the Callback function gets the name of the result file as its input parameter. The input parameter can be defined to be a struct containing the contents of the result file using the following parameter pair:

    'CallbackParams', {'<*>'}

The input parameter can also be defined to be a value in the result file using the following syntax.
    'CallbackParams', {'<parameter>'}

Here <parameter> is the name of a variable.

The Callback function can also be given input parameters directly from the Local Control Code. For example, when the function workspace of the Local Control Code contains the variables A and B, they can be defined as input parameters for the Callback function with the following syntax:

    'CallbackParams', {A,B}

The values returned by the Callback function become the values of the result vector returned by the peach-function. Since new values are appended to the result vector in the order in which Jobs are completed, the order of the values in the result vector does not necessarily match the order of the Job indices.

The implementation of the Streaming and Callback features is demonstrated using the Monte Carlo Pi method. In the distributed version of the program, Job results are streamed as soon as they have become available. The Callback function is used to plot the error of the approximation continuously on a chart.

Local Control Code

The Local Control Code of the Monte Carlo Pi example where a Callback function and Streaming are used is shown below.

    function result = run_stream(jobs,loops)
    global total; global rj; global data; global fig;
    data = inf(1,jobs);
    total = 0;
    rj = 0;
    figure(1);
    fig = plot(data, 'YDataSource', 'data');
    title('Amount of error in the Pi approximation');
    ylabel('Amount of error');
    xlabel('Number of Job result files processed');
    drawnow;
    result = peach('mcpi_dist', {loops}, 1:jobs, ...
        'StreamResults', 'true', ...                % Enable streaming
        'CallbackMethod', @summc, ...               % Name of the Callback function
        'CallbackParams', {loops, '<result>'});     % Parameters for the CB function
    end

    function result = summc(loops, val)     % Callback function
    global data; global fig; global rj; global total;
    rj = rj + 1;                            % One more result file received
    total = total + val;                    % Add new results to old results
    pivalue = total * 4 / (rj * loops);     % New approximate value of Pi
    err = pivalue - pi;                     % Calculate the error
    result = err;
    data(rj) = abs(result);
    if (mod(rj,10) == 0)                    % Update figure when 10 more results are available
        refreshdata(fig, 'caller');
        drawnow;
    end
    end

The Local Control Code here consists of two functions, run_stream and summc. The run_stream function distributes the computations using the peach-function. The summc function is the Callback function that is executed every time a new result is streamed from the Techila Server to the End-User. The variables used in the Callback function are declared as global in both functions, meaning that both functions can access these variables.

'CallbackMethod' is used to define the name of the Callback function used in handling the results. This example calls the 'summc' function for each result.

'CallbackParams' is used to define the input arguments of the Callback function. In this example, the Callback function is given the value of the variable 'loops' as the first argument, which is the number of iterations performed in each Job. The second input argument is defined by '<result>'.
This is a notation that tells the Callback function to use the output value of the Job as its input argument. In this example, '<result>' is replaced by the value of the result variable, which is returned by the Worker Code.

The Callback function 'summc' contains the arithmetic operations that continuously update the approximated value of Pi. This approximated value is then compared to the value of Pi in MATLAB to determine the absolute error. The value of this error is plotted on a graph every time 10 new results have been received and processed.

Worker Code

Streaming does not require any modifications to the Worker Code that is executed on the Workers. The code is identical to the basic Monte Carlo Pi implementation presented in 4.1 and is shown below.

    function result = mcpi_dist(loops)
    result = 0;
    for i = 1:loops
        if ((sqrt(rand() ^ 2 + rand() ^ 2)) < 1)
            result = result + 1;
        end
    end

Creating the computational project

The computational Project can be created by using the command:

    result = run_stream(100,200000)

This creates a Project consisting of 100 Jobs, where every Job performs 200,000 iterations. Results are streamed from the Techila Server to the End-User as they are completed and the graph is updated every time 10 new results have been post-processed.

The interaction between the Local Control Code and the results received from the Workers is illustrated in Figure 21 below.

Figure 21. The value of the loops-parameter is defined when the Local Control Code is called. The 'loops' parameter is also listed as a parameter in 'CallbackParams', meaning the parameter is transferred to the Callback function. The second input argument of the Callback function 'summc', called 'val', is replaced with the value of the 'result' variable, which is received from Workers.
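The running update performed by the Callback function can be emulated locally. The sketch below rests on assumptions: the per-Job counts are generated on the local computer instead of being streamed from a Techila Server, the arrival order is shuffled with randperm to imitate Jobs completing in an arbitrary order, and no figure is drawn.

```matlab
% Local emulation of streamed results; nothing here calls Techila.
jobs = 100;                          % number of emulated Jobs
loops = 20000;                       % iterations per emulated Job (assumed)
counts = zeros(1, jobs);
for j = 1:jobs                       % emulate the Worker Code for each Job
    counts(j) = sum(sum(rand(loops, 2).^2, 2) < 1);
end
arrival = randperm(jobs);            % results arrive in an arbitrary order
total = 0;
absErr = inf(1, jobs);               % same initial value as 'data' in run_stream
for rj = 1:jobs                      % one pass per received result
    total = total + counts(arrival(rj));     % add new result to old results
    pivalue = total * 4 / (rj * loops);      % new approximate value of Pi
    absErr(rj) = abs(pivalue - pi);          % absolute error so far
end
```

The final estimate is the same for every arrival order, because the counts are only summed. Only the intermediate values in absErr depend on the order in which the results arrive.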
4.4 Job Input Files

Job Input Files allow the use of Job-specific input files and can be used in scenarios where individual Jobs require access only to some of the files in a dataset. Job-specific Input Files are stored in a Job Input Bundle and are transferred to the Techila Server. The Techila Server transfers files from the Bundle to the Workers that require them. These files are stored on the Worker for the duration of the Job. The Job-specific Input Files are removed from the Worker as soon as the Job has completed.

The material used in this example is located in the following folder in the Techila SDK:

techila\examples\Matlab\Features\job_input_files

Job Input Files in MATLAB

The list of Job-specific Input Files is specified in the 'JobInputFiles' parameter as a cell array of cell arrays:

    'JobInputFiles', {{files_for_job_1}, {files_for_job_2}}

A Job can get a single Job-specific Input File as shown below:

    'JobInputFiles', {{'job_1_input_1'},{'job_2_input_1'},...}

A Job can also get several Job-specific Input Files as demonstrated below:

    'JobInputFiles', {{'job_1_input_1', 'job_1_input_2'},{'job_2_input_1', 'job_2_input_2'},...}

The names used for the Job-specific Input Files on a Worker can be specified using the 'JobInputFileNames' parameter. If a single Job-specific Input File was defined with 'JobInputFiles', the name of the file is defined as shown below:

    'JobInputFileNames', {'filename1'}

If a Job should get several Job-specific Input Files, their names are specified as follows:

    'JobInputFileNames', {'filename1','filename2',...}

'JobInputFileNames' is an optional parameter.
If the parameter is not used, the names default to 'jobinputfile1.mat' for the first file, 'jobinputfile2.mat' for the second, and so on.

The use of Job Input Files is illustrated using four images. Every image has a resolution of 500x500 pixels and features a quarter circle consisting of black and white pixels. The input files are illustrated below.

Each Job analyses one input file by calculating the number of black pixels. The ratio of black pixels to the total number of pixels is then used to approximate the value of Pi. The computational work performed in this example is trivial and is only intended to illustrate the mechanism of using Job-specific Input Files.

Local Control Code

The Local Control Code for creating a Project that uses Job-specific Input Files is shown below.

    function result = run_inputfiles()
    jobs = 4;
    result=peach('inputfiles_dist',{},1:jobs,...
        'JobInputFiles', {{'input1.png'},{'input2.png'},{'input3.png'},{'input4.png'}},...
        'JobInputFileNames',{'quadrant.png'});
    result = 4*(sum(cell2mat(result))/1e6);     % 4 images x 500 x 500 pixels = 1e6 pixels
    end

The 'JobInputFiles' parameter specifies the files that should be used in each Job. The syntax used in this example is shown below:

    'JobInputFiles',{{'input1.png'},{'input2.png'},{'input3.png'},{'input4.png'}}

This syntax assigns one input file to each Job. The file 'input1.png' is transferred to a Worker together with Job #1, 'input2.png' is transferred together with Job #2 and so on. The number of entries in the Job Input File list is equal to the number of elements in the 'peachvector'.
The 'JobInputFileNames' parameter specifies the names of the Job Input Files on the Workers. The syntax used in this example is shown below:

    'JobInputFileNames', {'quadrant.png'}

This syntax assigns the name 'quadrant.png' to all Job-specific Input Files.

Worker Code

The Worker Code used to analyse the individual input files is shown below.

    function result = inputfiles_dist()
    inputdata = imread('quadrant.png');         % Read the Job-specific input file
    result = length(find(inputdata==0));        % Count the black pixels
    end

In this example, all the Jobs access their input files by using the file name 'quadrant.png'. Each Job then calculates the number of black pixels in its image by examining the intensity value of each pixel. The number of black pixels is returned as the result.

Creating the computational project

The computational Project can be created by using the command:

    result = run_inputfiles

This creates a Project consisting of four Jobs. The system automatically assigns a Job-specific Input File to each Job, according to the file list specified in the Local Control Code. This is illustrated below in Figure 22.

Figure 22. Transferring Job-specific Input Files. All of the files are transferred to the Techila Server. The Techila Server transfers the requested Job Input File for each Job. These files are renamed on the Workers according to the parameters in the Local Control Code. In this example, the files are renamed to quadrant.png and copied to a temporary working directory on the Workers.
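The pixel-counting idea can be tried locally on a synthetic image. The sketch below is an assumption-based illustration: the image is generated in memory rather than read from one of the PNG files in the Techila SDK, and only a single 500x500 image is analysed, so the count is divided by the number of pixels in that one image.

```matlab
% Synthetic 500x500 image standing in for one of the input PNG files.
n = 500;
c = ((1:n) - 0.5) / n;                  % pixel-centre coordinates in (0,1)
[X, Y] = meshgrid(c, c);
img = uint8(255 * ones(n));             % white background
img(X.^2 + Y.^2 < 1) = 0;               % black quarter circle
blackPixels = length(find(img == 0));   % same counting as in inputfiles_dist
piEstimate = 4 * blackPixels / numel(img);
```

Unlike the random sampling used in the earlier examples, this count is deterministic: its accuracy is limited by the image resolution, not by the number of random points.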
4.5 Precompiled Binaries

The peach-function can also be used to distribute and execute precompiled binaries on the Workers. When using precompiled binaries, the peachclient wrapper is not used to transfer input parameters for the Jobs. Input parameters for the binaries need to be defined in the Project parameters. These input parameters are passed directly to the executable binary using the %P() notation. The names of output files also need to be specified separately and passed to the binary using the %O() notation.

The material used in this example is located in the following folder in the Techila SDK:

techila\examples\Matlab\Features\precompiled_binaries

Binaries in MATLAB

The syntax for defining input parameters for precompiled binaries is shown below:

    'ProjectParameters',{{'param_1', param_1},{'param_2', param_2}}

Parameters defined in 'ProjectParameters' are given to the binary by referring to them in the 'params' array with the %P() notation. The 'jobidx' parameter can be referenced without listing it in the Project parameters. This is because the 'jobidx' parameter is created automatically (by the Job splitter) on the Techila Server and is included automatically with each Job.

The platform of the binary is defined with the 'Binaries' parameter:

    'Binaries',{{'binary_name','operating_system','processor_architecture'}}

The 'Binaries' parameter can be used to define different binaries for different operating systems and processor architectures.
For example, a different binary for 32-bit Windows and Linux Workers can be defined with the syntax shown below:

    'Binaries',{{'Windows_binary_name','Windows','x86'},{'Linux_binary_name','Linux','i386'}}

Note that when the 'Binaries' parameter is used, the 'funcname' parameter of the peach-function call does not define the name of the executable function.

The names of output files are defined with the following parameter pair:

    'OutputFiles', {'List of Files'}

A single output file can be defined with:

    'OutputFiles', {'name_of_outputfile'}

Output files defined in the 'OutputFiles' parameter are given directly to the binary by referring to them with the %O() notation in the 'params' array.

The mechanism for distributing binaries is demonstrated with a precompiled version of the Monte Carlo Pi routine written in C. The binary is provided for two platforms: 32-bit Windows and Linux. The name of the binary is mcpi.exe for Windows and mcpi for Linux. The executable binary takes three input arguments; two of these are used to control the Monte Carlo routine and the third specifies the name of the output file. The generic syntax is:

    mcpi jobidx loops output

The input arguments are:
- jobidx. Initializes the random number generator seed
- loops. Determines the number of iterations
- output. Determines the name of the output file

The binaries can also be executed locally. To execute the Windows binary locally on a computer with a Windows operating system, follow the steps listed below:

1. Open a Command Prompt
2. Change the working directory to the directory containing the mcpi.exe/mcpi files
3. Execute the program using the command:

    mcpi 1 100000 data

This executes the binary and performs 100,000 iterations of the Monte Carlo Pi routine. The results are stored in a file called data. The data file stores two values: the number of points that were located inside the unit circle and the number of iterations.

Local Control Code

The code used to control the distribution process of the precompiled binaries is shown below. Note that the 'funcname' parameter can be chosen quite freely in this example. This is because the names of the executable binaries are defined with the 'Binaries' parameter.

    function result = run_binary(jobs,loops)
    result=peach('Precompiled Binary', {['%P(jobidx) %P(loops) %O(output1)']}, 1:jobs,...
        'Binaries', {{'mcpi.exe','Windows','x86'},{'mcpi','Linux','i386'}},...
        'Executable','true',...
        'MatlabRequired','false',...
        'CallbackMethod', @getdata,...
        'ProjectParameters',{{'loops',loops}},...
        'OutputFiles',{'data'});
    result=4 * sum(cell2mat(result)) / (loops * jobs);
    end

    function result = getdata(f)
    result = load(char(f));     % Load the contents of the output file
    result = result(1);         % First value: points inside the unit circle
    end

The Local Control Code consists of two functions, 'run_binary' and 'getdata':

- 'run_binary' is used to distribute the computations. This function contains the peach-function call and returns the output value of the post-processed results.
- 'getdata' is a Callback function that is used to read the contents of result files after they have been downloaded from the Techila Server.
The peach-function call contains the following parameters:

    'Binaries', {{'mcpi.exe','Windows','x86'},{'mcpi','Linux','i386'}}

The above parameter specifies that the name of the binary is 'mcpi.exe' for Windows Workers and 'mcpi' for Linux Workers.

    'Executable','true'

The above parameter specifies that the 'funcname' parameter refers to a precompiled binary.

    'MatlabRequired','false'

The above parameter specifies that a MATLAB Runtime Bundle is not required.

    'CallbackMethod', @getdata

The above parameter specifies that the name of the Callback function is 'getdata'. This function is used to load the contents of the output files and retrieve the values returned from the Jobs.

    'ProjectParameters',{{'loops',loops}}

The above parameter specifies the 'loops' variable as a Project parameter, making it available to the executable binary as a Job input parameter. This input parameter is used by referring to it in the 'params' array with the %P(loops) notation.

    'OutputFiles',{'data'}

The above parameter specifies the name of the output file, making it available to the binary by referring to it in the 'params' array.

The 'params' array in the peach-function call is shown below:

    {['%P(jobidx) %P(loops) %O(output1)']}

This defines two input parameters and one output file for the 'mcpi' executable. The way in which these input parameters are transferred to the binary is illustrated in Figure 23 below.

Figure 23. The value of the loops parameter is defined by the End-User when executing the Local Control Code.
The loops parameter is also defined in 'ProjectParameters', which means that it can be referenced with the %P() notation in the params array. The name of the output file is defined with 'OutputFiles' and given directly to the binary with the %O() notation. The value of the jobidx parameter is received directly from the Techila Server, meaning it does not have to be defined in 'ProjectParameters'.

Worker Code

Each Worker performs the Monte Carlo routine by executing the precompiled 'mcpi.exe' or 'mcpi' binary. The binary is executed according to the input arguments that were defined in the peach-function 'params' array.

Creating the computational project

The project can be created by executing the Local Control Code:

    result = run_binary(10,100000)

This will create a project consisting of 10 Jobs. During each Job, the 'mcpi' binary will be executed and will perform a Monte Carlo routine that consists of 100,000 iterations. After the Monte Carlo routine has been completed, the result of the approximation will be stored in a file called 'data'. This file will be returned from the Workers to the Techila Server. After all Jobs have been completed, the output files will be transferred to your computer.

After the output files have been transferred to your computer, each file will be processed with the getdata-function. This function loads each output file and retrieves the first value from the file, which contains the number of points that were inside the unit circle for that Job. This value is returned from the Callback function, meaning it is also returned by the peach-function as an element of the result-vector. After all output files have been processed, the values in the result-vector are used to calculate the approximated value of Pi.
4.6 MEX files

MEX files are MATLAB executables, which can be used to access a large number of existing C, C++ or FORTRAN routines directly from MATLAB, without having to rewrite them as M-files. MEX files can be included in the computational project by defining the names of the source files during project creation. The source files will then be automatically compiled into MEX files and transferred to the Workers.

The use of MEX files is illustrated using the Monte Carlo routine, where the computationally intensive part has been written in C in a source file called 'mcpi.c', which is compiled into a MEX file. The MEX function accepts two input parameters: the first input parameter specifies the number of iterations, the second input parameter is used to seed the random number generator.

The material used in this example is located in the following folder in the Techila SDK:

    techila\examples\Matlab\Features\mex_files

MEX files in MATLAB

MEX files can be included in the compilation using the following parameter pair:

    'MexFiles', {{'file1.c', 'requirement1.c'}, 'file2.c'}

This parameter compiles the listed source files into MEX files. The compiled files are automatically included with each Job in the project.

Local Control Code

The Local Control Code to include MEX files in a computational Project is shown below.

    function result = run_mex(jobs,loops)
    result = peach('mcpi_wrapper',{loops,'<param>'},1:jobs,...
        'MexFiles',{'mcpi.c'});
    result = sum(cell2mat(result));
    result = 4 * result / (jobs * loops);
    end

The use of MEX files is defined with the following parameter pair:

    'MexFiles',{'mcpi.c'}

The parameter pair defines that the file named 'mcpi.c' is required in the computations and needs to be compiled and transferred with each Job. This file is located in the same directory as the 'mcpi_wrapper.m' file and the other files related to this example.

Worker Code

The Worker Code used in this example is shown below.

    function result = mcpi_wrapper(loops,jobidx)
    result=mcpi(loops,jobidx);
    end

The executable function 'mcpi_wrapper' acts as a wrapper for calling the MEX file. The syntax for calling the MEX function is the same as for calling a built-in function. After the MEX function has been executed, the return value will be stored in the 'result' variable, which in turn will be returned from the Worker.

MEX code

The MEX code is shown below.
    #include <stdlib.h>   /* rand, srand, RAND_MAX */
    #include <string.h>
    #include <math.h>
    #include <time.h>     /* time */
    #include "mex.h"

    void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
    {
        int j, loops, jobidx;
        double *output, x, y, c;

        // Create the output array (initialized to zero)
        plhs[0] = mxCreateDoubleMatrix(1,1,mxREAL);
        output = mxGetPr(plhs[0]);

        // Get input integer values
        loops = (int)(mxGetScalar(prhs[0]));   // The number of iterations
        jobidx = (int)(mxGetScalar(prhs[1]));  // Used to initialize the RNG

        // Init the random number generator
        srand(time(0) * jobidx * 12837);

        for (j = 0; j < loops; j++)   // Monte Carlo approximation
        {
            x = ((float) rand()/RAND_MAX);
            y = ((float) rand()/RAND_MAX);
            c = sqrt(pow(x,2)+pow(y,2));
            if (c < 1)
            {
                output[0]++;   // Point is inside the unit circle
            }
        }
    }

The algorithm in the MEX file requires two input parameters: 'jobidx' is used to initialize the random number generator seed and 'loops' determines the number of iterations done in the Monte Carlo routine. The results are stored in a variable named 'output', which will be returned as the output value of the 'mcpi_wrapper' function.

Creating the computational project

The project can be created using the command:

    result = run_mex(10,1000000)

This will create a Project consisting of 10 Jobs, each Job performing 1,000,000 iterations. The 'mcpi.c' file will be automatically compiled and transferred to the Workers. The Monte Carlo approximation will be done by executing the Monte Carlo algorithm in the MEX file according to the values defined in the 'params' array.
4.7 Project Detaching

When a Project is detached, the peach-function returns immediately after all of the computational data has been transferred to the Server. This means that MATLAB does not remain in a "busy" state for the duration of the Project and can be used for other purposes while the Project is being computed. The results of a Project can be downloaded after the Project has been completed.

The material used in this example is located in the following folder in the Techila SDK:

    techila\examples\Matlab\Features\detached_projet

Project detaching in MATLAB

Projects can be detached using the following parameter pair:

    'DoNotwait', 'true'

This will cause the peach-function to return immediately after the Project has been created and all computational data has been transferred to the Techila Server. The peach-function call will return the Project ID number, which can be used in the download process.

Results can be downloaded by linking the peach-function call to an existing Project ID number using the following parameter pair:

    'ProjectId', projectid

It is also possible to download the results of a previously completed Project (assuming the results have not been removed from the Techila Server), even when the original Project was not detached with the 'DoNotwait' parameter. Results belonging to such Projects can be downloaded by defining the Project ID number as the value of the 'ProjectId' parameter. For example, the following syntax would link the peach-function call to Project 1234.

    'ProjectId', 1234

The following example demonstrates detaching a Project and downloading the results using a peach-function call.
Local Control Code

The Local Control Code for creating a Detached Project is shown below.

    function projectid = run_detached(jobs,loops)
    projectid = peach('mcpi_dist', {loops}, 1:jobs, ...
        'DoNotwait', 'true');
    end

The project is detached with the parameter pair:

    'DoNotwait', 'true'

This parameter pair causes the peach-function to return immediately after the Project has been created. The variable 'projectid' will contain the Project ID number of the Project that has been created. This Project ID number can be used to download the results.

As soon as the Project has been completed, the results can be downloaded by performing a peach-function call that is linked to the existing Project ID. The function used to download the results is shown below.

    function result = download_pi_results(projectid)
    result=peach('',{},1,...
        'ProjectId', projectid,...      % Links to an existing project
        'ForceNotCompile', 'true');     % Compilation not required
    result = cell2mat(result);
    result = 4*sum(result(:,1)) / (result(1,2) * size(result,1));
    end

The function 'download_pi_results' downloads the results. The peach-function is only used to contact the Techila Server and to request the results, which means that the 'funcname' and 'params' parameters are not required here. The 'ProjectId' parameter specifies the Project ID number to which the peach-function call is linked. The value of the parameter is defined by the input argument of the 'download_pi_results' function.
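The post-processing expression in 'download_pi_results' can be checked locally with made-up data. The sketch below is an illustration only: the counts are invented, and it assumes that the results arrive as an N-by-1 cell array with one two-element row [points_inside loops] per Job, which is what the indexing in 'download_pi_results' suggests.

```matlab
% Made-up results for three Jobs; each cell holds [points_inside, loops].
% Assumption: one cell per Job, stacked as a column (N-by-1 cell array).
result = {[785398 1000000]; ...
          [785120 1000000]; ...
          [785705 1000000]};

result = cell2mat(result);        % 3-by-2 matrix: one row per Job
hits   = sum(result(:,1));        % total number of points inside the unit circle
loops  = result(1,2);             % iterations per Job, stored with the results
jobs   = size(result,1);          % number of Jobs, taken from the number of rows

pivalue = 4 * hits / (loops * jobs);
```

Because the number of iterations and the number of Jobs are both recovered from the downloaded data, the calculation does not depend on any variable that existed when the Project was created.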
Worker Code

The code that is executed on the Workers is shown below.

    function result = mcpi_dist(loops)
    result = 0;
    for i = 1:loops
        if ((sqrt(rand() ^ 2 + rand() ^ 2)) < 1)
            result = result + 1;
        end
    end
    result = [result loops];   % Return the count and the number of iterations
    end

The Worker Code is similar to the basic implementation shown in section 4.1. The only difference is that the result also contains the number of iterations that were performed during the Monte Carlo routine. The number of iterations is stored in order to preserve information that is required in the post-processing. Embedding the variables required in post-processing in the result files means that the post-processing can be performed correctly regardless of when the results are downloaded.

Creating the Project

The computational Project can be created by executing the following command:

    projectid = run_detached(10,1000000)

This creates a Project consisting of ten Jobs. After all of the computational data has been transferred to the Techila Server, the Project ID number will be returned to the 'projectid' variable. The Project ID number can be used to download the results of the Project after the Project has been completed.

After the Project has been completed, the results can be downloaded from the Techila Server with the 'download_pi_results' function using the syntax shown below:

    pivalue = download_pi_results(projectid)
4.8 Iterative Projects

Using iterative projects is not so much a feature as it is a technique. Projects that use the output values of previous projects as input values can be implemented by placing the peach-function inside a loop structure.

This example illustrates the technique by creating several consecutive projects that perform the Monte Carlo Pi approximation. The iterative process stops as soon as the error of the approximated value of Pi falls below a threshold value.

The material used in this example is located in the following folder in the Techila SDK:

    techila\examples\Matlab\Features\iterative_project

Local Control Code

The Local Control Code used to create several consecutive projects is shown below.

    function run_pi_iteration()
    threshold = 2e-4;     % Maximum allowed error
    n = 20;               % Number of Jobs
    loops = 1e7;          % Number of iterations performed in each Job
    total_result = 0;     % Initial result, no approximations have been performed
    iteration = 1;        % Project counter
    current_error = pi;   % Initial error, no approximations have been performed

    if exist('restorepoint.mat','file')   % Are the computations being resumed?
        load restorepoint                 % Load intermediate results
    end

    techilainit();        % Initialize the Techila environment once
    while abs(current_error) >= threshold
        result = sum(cell2mat(peach('mcpi_dist',{'<param>',iteration,loops},1:n,...
            'Messages','false',...        % Turn off messages
            'DoNotInit','true',...        % Do not initialize
            'DoNotUninit','true')));      % Do not uninitialize
        total_result = total_result + result;
        approximated_pi = total_result * 4 / (loops * n * iteration);
        current_error = approximated_pi - pi;
        fprintf(1,'Error in the approximation = %.16f\n',current_error);
        iteration=iteration+1;
        save restorepoint iteration total_result
    end
    techilauninit();      % Uninitialize the Techila environment
    fprintf('Error below threshold value, no more Projects required.\n')
    fprintf('Approximated value of Pi = %g\n',approximated_pi)   % Print the approximated value of Pi
    delete('restorepoint.mat')
    end

The peach-function call is placed inside a loop structure, which is implemented with a while statement. The while condition is true as long as the absolute error of the approximated value of Pi is greater than or equal to the threshold value. In that case, a new computational project will be created to improve the accuracy of the approximation. Intermediate results will be saved locally every time the results of the latest project have become available.

Worker Code

The algorithm for the Worker Code is shown below.

    function result = mcpi_dist(jobidx,iteration,loops)
    result = 0;
    rand('state',jobidx*iteration)                  % Seed the random number generator
    for i = 1:loops                                 % Monte Carlo loop from 1 to loops
        if ((sqrt(rand() ^ 2 + rand() ^ 2)) < 1)    % Point is inside the unit circle
            result = result + 1;
        end
    end

Apart from the random number seeding method, the code is identical to the one used in section 4.1.
The random number seeding is done in order to ensure that the Jobs in the Projects generate such results that the error threshold is reached after a reasonable number of Projects.

Creating the computational Project

The Local Control Code can be executed using the command:

    run_pi_iteration

The command shown above will create projects consisting of 20 Jobs, as defined by the variable 'n' in the Local Control Code. Each Job will consist of 10 million iterations. Projects will be created until the error of the approximated value is smaller than the threshold value. The error of the approximation will be printed every time a Project has been completed.

4.9 Remote Compiling

Compiled MATLAB binaries are platform-dependent. This means that they can be executed only on the same native platform where the compilation was done. Binaries compiled on a Microsoft Windows platform can be executed on native Microsoft Windows Workers only, Linux binaries can be executed on Linux Workers only, and so on.

Techila Remote Compiling makes it possible to compile binaries on Workers. This gives the End-User access to binaries compiled on all the platforms in the Techila environment, enabling efficient use of heterogeneous computing resources.

Before using Remote Compiling, please verify the terms of your MATLAB licence.

The material used in this example is located in the following folder in the Techila SDK:

    techila\examples\Matlab\Features\remote_compiling

Please note that this example requires that your Techila Administrator has enabled Remote Compilation in your Techila environment.
Remote Compiling in MATLAB

The Remote Compiling feature can be enabled either with a peach-function parameter or by calling the Techila MATLAB toolbox function directly. Remote Compiling is enabled with the following peach-function parameter pair:

    'RemoteCompile', 'true'

When the Remote Compiling feature is enabled in the peach-function call, a number of compilation Projects are created prior to the actual computational Project. During the compilation Projects, binaries are compiled for all available platforms. These binaries are then automatically used in the computational Project.

The Remote Compiling toolbox function can be called directly with the following command:

    remotecompile('name_of_m_file')

This creates a number of compilation Projects, in which the source code defined in the m-file is compiled into a binary on the Workers. After the compilation Projects are completed, the binaries are transferred back to the End-User's computer. All additional parameters that can be used in Remote Compilation Projects can be viewed directly from MATLAB using the command:

    doc remoteCompile

Local Control Code

The Local Control Code to create a project using Remote Compiling is shown below.

    function result = run_remote_pi(jobs, loops)
    result=peach('mcpi_dist', {loops}, 1:jobs,...
        'RemoteCompile', 'true');
    result = sum(cell2mat(result))* 4 / (loops * jobs);
    end

Remote Compiling is enabled with the following parameter pair:

    'RemoteCompile', 'true'

Using the specified parameters, the compilation is performed locally for the native operating system of the End-User's computer. Binaries are compiled remotely for the available platforms included in the default settings. Excluding the locally compiled platform, the default settings compile the binary remotely for the following platforms:

    Default platforms    Windows    Linux    OS X
    32-bit               X          X        -
    64-bit               X          X        X

Worker Code

The code that will be compiled and executed on the Workers is shown below.

    function result = mcpi_dist(loops)
    result = 0;
    for i = 1:loops
        if ((sqrt(rand() ^ 2 + rand() ^ 2)) < 1)
            result = result + 1;
        end
    end

The code is the same as the one described in Chapter 4.1, where more information on the functionality of the program can be found.

Creating the compilation and computation project

The Projects can be created using the command:

    pivalue = run_remote_pi(10,1000000)

When the Local Control Code is executed, a number of Projects are created that attempt to compile 'mcpi_dist.m' into an executable binary remotely. A separate Project is created for each target platform, excluding the native operating system of the End-User, for which the binary is compiled locally. As soon as all binaries have been compiled, the computational Project is created using the parameters defined in the basic syntax of the peach-function.
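Because Remote Compiling starts several compilation Projects before any Job runs, it can be useful to try the Monte Carlo routine and the post-processing locally first. The following is only a sketch of such a dry run and makes no Techila calls: 'cellfun' is used as a stand-in for the distribution step, the helper name 'count_hits' is hypothetical, and the Monte Carlo loop is written in vectorized form rather than as the loop shown in the Worker Code.

```matlab
% Local dry run, no Techila environment required. cellfun stands in for the
% distribution step and returns one in-circle count per simulated Job.
jobs  = 4;      % number of simulated Jobs (illustrative value)
loops = 1e5;    % iterations per simulated Job (illustrative value)

% Hypothetical helper: vectorized equivalent of the Monte Carlo loop.
count_hits = @(n) sum(sqrt(rand(n,1).^2 + rand(n,1).^2) < 1);

% One cell per Job, similar in shape to what cell2mat is applied to in
% run_remote_pi. The Job index is not used by the routine itself.
result = cellfun(@(jobidx) count_hits(loops), num2cell(1:jobs), ...
    'UniformOutput', false);

% Same post-processing expression as in run_remote_pi.
pivalue = sum(cell2mat(result)) * 4 / (loops * jobs);
fprintf('Approximated value of Pi = %g\n', pivalue);
```

The printed value varies from run to run, since the random number generator is not seeded in this sketch.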
5 Troubleshooting

This chapter contains information on some of the most frequently encountered error messages. For more troubleshooting tips, please see the document Techila Fundamentals.

Problem: Java Out of Memory errors when running long projects.
Solution: The memory reserved for Java in MATLAB is too small. This can be fixed by creating a text file named java.opts in the $MATLABROOT/bin/$ARCH directory (e.g. \bin\win32 on Windows) and adding the following two lines:

    -XX:PermSize=64m
    -XX:MaxPermSize=256m

Problem: Unable to compile/select compiler on 64-bit MATLAB (Windows).
Solution: Install Microsoft Visual Studio, or use 32-bit MATLAB. More information can be found on the following website:
http://www.mathworks.com/support/compilers/R2009b/win64.html

Problem: Global variables lost.
Solution: Previous versions lost global variables when init was called. The current version does not, but global <var> must still be called after each init / peach call.

Problem: Project fails and Workers generate an error message resembling the one shown below:
"Failed to create a directory required to extract the CTF file."
Solution: The path of some of the files included in the compilation is too long, causing the extraction process to fail on the Worker. To solve this problem, copy the files to a location with a shorter path or use, for example, the "subst" command to substitute the path.
6 Appendix

6.1 Appendix 1: Examples on Cloudfor control parameter definitions

This Chapter contains a collection of examples on how to define additional parameters for the 'cloudfor' helper functions.

General syntax: %cf:stepsperworker=<integer>
Example: %cf:stepsperworker=2
Explanation: Executes two iterations in each Job.

General syntax: %cf:estimate=<numeric>
Example: %cf:estimate=60
Explanation:

General syntax: %cf:inputparam=<comma separated values>
Example: %cf:inputparam=in1,in2
Explanation: Only Workspace variables 'in1' and 'in2' are transferred to Workers.

General syntax: %cf:outputparam=<comma separated values>
Example: %cf:outputparam=out1,out2
Explanation: Only Workspace variables 'out1' and 'out2' are returned from Workers.

General syntax: %cf:sum=<comma separated values>
Example: %cf:sum=var1,var2
Explanation: Sum return values 'var1' and 'var2'.

General syntax: %cf:cat=<comma separated values>
Example: %cf:cat=var1,var2
Explanation: Concatenate return values 'var1' and 'var2'.

General syntax: %cf:replace=<comma separated values>
Example: %cf:replace=var1,var2
Explanation: Replace return values 'var1' and 'var2' with the latest result.

General syntax: %cf:dependency=<comma separated values>
Example: %cf:dependency=ev_func,ev_func2
Explanation: Defines that functions 'ev_func' and 'ev_func2' will be required during the Job.

General syntax: %cf:stream=<true|false>
Example: %cf:stream='false'
Explanation: Disables streaming. Results are returned in one package after all Jobs are completed.

General syntax: %cf:quiet
Example: %cf:quiet
Explanation: Disables messages.

General syntax: %cf:parameters=<string>
Example: %cf:parameters=['1 1000 output.txt']
Explanation: Defines three input arguments for the precompiled binary: '1', '1000' and 'output.txt'.

General syntax: %cf:force:loopcount
Example: %cf:force:loopcount
Explanation: Removes the 100,000 Job safety limit in a Project.

General syntax: %cf:force:largedata
Example: %cf:force:largedata
Explanation: Removes the 10 MB Workspace variable limit in a Project.

General syntax: %cf:callback <string>
Example: %cf:callback refreshdata
Explanation: Calls the function 'refreshdata' each time a result has been received.

General syntax: %cf:resultfilevar=<variable>
Example: %cf:resultfilevar=filelist
Explanation: Stores the location of the result files to the 'filelist' variable.

General syntax: %cf:importdatavar=<variable>
Example: %cf:importdatavar=out_var
Explanation: Imports the result data from the result files to the 'out_var' variable.

General syntax: %cf:importdata={delimiter, nheaderlines}
Example: %cf:importdata={',',1}
Explanation: Defines a comma (,) as the column separator and one header line.

General syntax: %cf:datafile=<variable>|<csv>
Example: %cf:datafile={{'file1','file2'}}
Explanation: Transfers files 'file1' and 'file2' to all participating Workers.

General syntax: %cf:jobinputfile=<paramx>|<see peach>
Example: %cf:jobinputfile JobFile={{'file1'},{'file2'}}
Explanation: Transfers Job-specific input files for two Jobs. File 'file1' for Job #1 and file 'file2' for Job #2. Files will be renamed to 'JobFile' on Workers.

General syntax: %cf:donotinit
Example: %cf:donotinit
Explanation: Do not initialize the Techila environment. techilainit() needs to be called before cloudfor if this is specified.

General syntax: %cf:donotuninit
Example: %cf:donotuninit
Explanation: Do not uninitialize the Techila environment. Typically used with the 'donotinit' parameter.

General syntax: %cf:peach <peachparam>=<peachvalue>
Example 1: %cf:peach RemoteCompile='true'
Explanation 1: Enables remote compilation in the Project.
Example 2: %cf:peach JobInputFiles={{'input1.mat'},{'input2.mat'}}
Explanation 2: Assigns Job-specific input files for two Jobs. File 'input1.mat' for Job #1 and file 'input2.mat' for Job #2. The files will be renamed to 'jobinputfile1.mat' on the Workers.