A Sample Based Algorithm to Determine Minimum Robust Cost Path with Correlated Link Travel Times

A. Arun Prakash ∗† and Karthik K. Srinivasan ‡

November 18, 2013

Total word count: 7440 + 1000 (3 figures + 1 table) = 8440

Submitted to the TRB 93rd Annual Meeting, January 10-14, 2014 for presentation and publication

∗ Corresponding Author
† Transportation Engineering Division, Department of Civil Engineering, IIT Madras, Chennai 600036, INDIA. e-mail: [email protected]
‡ Room No: 235, Transportation Engineering Division, Department of Civil Engineering, IIT Madras, Chennai - 600036, INDIA. e-mail: [email protected]

TRB 2014 Annual Meeting Paper revised from original submittal.

Abstract

Travel time reliability is an important and desirable property in route and departure time choice, particularly for a risk-averse traveller. Optimizing for reliability has therefore been gaining interest in recent years in fields such as transportation, computer science and stochastic optimization. The present work addresses reliability optimization under uncertainty, where travel time distributions are represented by a sample. The weighted mean-standard deviation measure (robust cost) is adopted as the metric of reliability. The minimum robust cost path problem with link travel times following a general correlation structure is addressed. A sampling-based approach, which is relatively unused, is adopted from the literature to capture and represent spatial correlations. A novel network transformation and pruning procedure is proposed which determines the exact solution to the problem and circumvents the high dimensionality of the formulations proposed in the literature. Computational experiments demonstrate the efficacy of the algorithm, which was found to perform well on real world networks.
The impact of the sample approximation on the population (true) optimum was quantified and found to be acceptable.

Keywords: travel time variability, mean-variance, mean-standard deviation, reliability, correlations, most reliable path

1 Introduction

Uncertainty in transportation networks can be attributed to both supply side and demand side characteristics. Supply side factors include capacity variability, signal controls, incidents, accidents and weather, while demand side factors include traveller behaviour, spatial and temporal demand variation, and special events. The stochastic interactions among these factors lead to highly variable system performance and high travel times, especially in peak periods. To mitigate such impacts, there has recently been growing interest in optimizing performance in uncertain environments. In this regard, this paper addresses the minimum robust cost path (MRCP) problem with stochastic link travel times having arbitrary distributions and general correlation structures.

In this study, the "robust cost of a path" is a measure of reliability defined as the weighted average of the mean and standard deviation of travel time on the path. This measure is chosen because the travel times are not restricted to pre-specified, commonly used distributions (e.g. normal, log-normal). Further, the weight on the standard deviation allows a straightforward representation and interpretation of risk aversion to unreliable travel times. This objective also circumvents the need to evaluate path travel time distributions using high dimensional multivariate integrals.

The motivation behind this study is threefold. First, it is widely accepted that users of transportation networks place a high value on reliability while making their travel decisions [1, 18].
Second, the studies which optimize reliability measures under uncertainty make restrictive assumptions of link independence or specific correlation structures. Third, the MRCP problem is difficult to solve for the reasons listed below.

Empirical evidence suggests that travel times are significantly correlated across links and that these correlations should not be disregarded [13]. Unfortunately, quantifying travel time distributions and general correlation matrices is non-trivial, owing to the difficulty of estimating valid correlation structures from empirical data and the intractability of estimating path travel time distributions analytically. Moreover, the presence of correlations makes the MRCP problem difficult to solve, as several desirable properties of conventional shortest path problems, such as linearity, link-separability, subpath optimality and subpath non-dominance, do not hold. Due to these data issues and algorithmic challenges, practical applications for optimizing reliability-based objectives are not yet sufficiently well-developed for realistic network sizes.

In particular, the MRCP problem has not received adequate attention in the literature. Important exceptions include [21], which proposes an exact solution for special correlation cases, and [24], which addresses the general correlation case but with a heuristic solution procedure. This study attempts to build upon the advantages of both studies and proposes a new exact solution procedure for the general correlation case. Specifically, this study addresses three objectives:
1) To propose a new subpath elimination criterion for the MRCP problem.
2) To develop a new and efficient algorithm to solve the MRCP problem based on the subpath elimination criterion.
3) To investigate the computational performance of the proposed algorithm on real world networks.

To handle correlations, a sampling-based approach with implicit correlation representation is adopted from the literature [24, 25], as it circumvents correlation quantification issues and leads to increased computational tractability (as explained in Section 3.3). However, the solution approach in this study differs from and contributes to the literature in the following respects. A new "norm based subpath elimination criterion" is developed for the robust cost objective to identify and eliminate subpaths that are sub-optimal. Based on this criterion, a label-correcting pruning algorithm is developed for the MRCP problem. The proposed approach leads to significant enhancements in algorithm performance by applying network transformation techniques (Dial's efficiency, Johnson's transformation, and localization) and integrating them with the sample-based quantification technique. Consequently, the new solution procedure yields a substantial dimensionality reduction, from multiple objectives to a two-objective problem, and an exact solution is guaranteed. In addition, unlike prior studies, the quality of the solution from the sample approximation formulation is also investigated.

The rest of the paper is organized as follows. Section 2 presents a brief review of related literature and identifies gaps in existing studies. The problem definition and formulation are discussed in Section 3. The solution procedure is presented in Section 4 along with the implementation details. Computational experiments and results are discussed in Section 5, followed by conclusions in Section 6.
2 Literature Review

Travel time reliability has received considerable emphasis in the recent literature. Quantification of travel time variability at the network level faces various challenges, including limited sample size, sparse spatial data availability and separation of the sources of variability. Readers are directed to [2] for a more detailed discussion.

In the literature, stochastic networks have been adopted to model uncertainty for network optimization. Various objective functions have been chosen for optimization, including the least expected travel time (LET), mean-variance measures and probability measures of reliability. The LET path problem under en-route guidance was addressed by [6, 16, 23], where dynamic programming based algorithms with exponential worst case complexity were implemented. The LET path problem in stochastic and time dependent networks was addressed by [9, 10, 17] using a stochastic non-dominance criterion along with label correcting procedures.

Though the LET path problem models decision making under uncertainty, it fails to take into consideration the users' risk-averse behaviour. Evidence from multiple studies indicates that users value reliability considerably [1, 3, 18]. To model this behaviour, [22] introduced a variance constraint into the standard LET problem. The resulting integer programming problem, assuming link independence, was solved using a branch and bound procedure. [19] adopted a similar formulation and proposed a solution procedure applicable to correlation structures with the cycle covariance property (the variance of a path decreases after removal of a cycle), but the performance on large networks was not tested. Other studies have adopted multi-attribute utility functions as objective functions, with early and late schedule delays [12, 15] and variability as attributes.
[15] showed that this class of problems is NP-hard even for simple quadratic penalty/disutility functions without link correlations. Probabilistic measures (on-time arrival reliability) have also been considered as the objective function [8, 11, 14, 20]. The need to quantify path travel time distributions and correlations restricts these approaches to a few common distributions. Readers are referred to [20] for a more detailed review of studies that optimize probabilistic reliability metrics.

To avoid some of the quantification issues noted above, a linear function of the mean and a measure of variability has also been used in the literature as a metric of reliability (as in the present study). The MRCP problem without correlation can be solved by determining the non-dominated set in the objectives mean and variance by exploiting the subpath non-dominance property, as was done in [4, 5]. However, the significance of correlations and their effect on optimality has been demonstrated by [20, 21]. [21] proposed an algorithm and a heuristic for the mean-variance trade-off problem for the case where the Cholesky coefficients of the link covariance matrix are positive, which is too restrictive and seldom holds in transportation networks. The problem was reformulated as a multiple-objective shortest path problem in m dimensions (where m is the number of links) and solved by determining the non-dominated set. [24] proposed two approximation methods to solve the MRCP problem with and without link correlations. A sample approximation approach was used to model correlations and a Lagrangian substitution based lower bound heuristic was proposed for optimization. An average error of over 5.4% and suboptimality in 25.5% of cases was reported in the study.

In summary, compared to the deterministic path optimization models, fewer studies aim to optimize reliability related objectives.
Among the few studies that focus on such stochastic optimization, link travel times are often assumed to be independent owing to data and optimization difficulties, which is restrictive and unrealistic [4]. The issues of distribution quantification, correlation estimation and reliability quantification with empirical data impose statistical and computational challenges. Unfortunately, the restrictive assumption of link independence or limited correlations can lead to sub-optimal solutions [20]. Due to these difficulties, practical applications of reliability-based optimization on real networks based on empirical data remain elusive. This study tries to address some of these gaps.

3 Problem Definition and Formulation

3.1 Context and Scope

The transportation network is represented as a directed graph $G(N, A)$ where $N = \{1, 2, \ldots, n\}$ represents the set of nodes and $A$ ($|A| = m$) represents the set of $m$ directed arcs/links. The link travel times are assumed to be random variables that follow a multivariate distribution, $t \sim f(\mu, \Sigma)$, where $\mu$ is the mean travel time vector and $\Sigma$ is the link travel time covariance matrix. The present study addresses a static (time invariant) context, and the minimum robust cost path is considered to be an a priori path whereby recourse decisions are not permitted.

3.2 Minimum Robust Cost Path

The mean and variance of travel time of path $P$, represented by $\mu_P$ and $\sigma_P^2$, are given by

\[ \mu_P = \sum_{i \in P} \mu_i \quad \text{and} \quad \sigma_P^2 = \sum_{i \in P} \sigma_i^2 + \sum_{i \in P} \sum_{j \in P,\, j \neq i} \rho_{ij} \sigma_i \sigma_j \tag{1} \]

where $\mu_i$ and $\sigma_i$ are the mean and standard deviation of travel time on link $i$, $\forall i \in A$, and $\rho_{ij}$ represents the correlation coefficient between the travel times of links $i, j \in A$.
The robust cost of path $P$, $RC_P$, given a weight $\delta$, is defined as a linear combination of the mean and standard deviation of the path travel time:

\[ RC_P = \delta \mu_P + (1 - \delta) \sigma_P \tag{2} \]

The parameter $\delta$, which lies between 0 and 1, represents the degree of risk aversion of the user. A low value of $\delta$ represents a highly risk averse user, while a high value represents a less risk averse user. The objective of the minimum robust cost path problem is to determine the path $P^*$ such that its robust cost $RC_{P^*}$ is no greater than the robust cost $RC_P$ of any other path $P$ from origin $r$ to destination $s$.

MRCP:

\[ \min\; RC_P(x) = \delta \sum_{(u,v) \in A} \mu_{uv} x_{uv} + (1 - \delta) \left[ \sum_{(u,v) \in A} \sigma_{uv}^2 x_{uv} + \sum_{(u,v) \neq (u',v') \in A} \rho_{uv-u'v'} \sigma_{uv} \sigma_{u'v'} x_{uv} x_{u'v'} \right]^{0.5} \tag{3a} \]

\[ \sum_{v:(u,v) \in A} x_{uv} - \sum_{v:(v,u) \in A} x_{vu} = \begin{cases} 1 & u = r \\ -1 & u = s \\ 0 & u \in N - \{r, s\} \end{cases} \tag{3b} \]

where the decision variable $x_{uv} = 1$ if arc $(u, v)$ belongs to path $P$ and 0 otherwise. The flow conservation constraints (3b) ensure that the series of links carrying non-zero flow in the optimal solution constitutes a directed path from the source to the sink.

As noted earlier, the subpath optimality and subpath non-dominance properties do not hold for the MRCP problem. As an illustration, consider the network shown in Figure 1, where the mean and variance of the link travel times are given on the arcs. Let the correlation between arcs (1, 3) and (3, 4) be -0.5 and between arcs (2, 3) and (3, 4) be 0.5. Let the weight ($\delta$) be equal to 0.2. Note that for the path 1-3-4 the robust cost is not additive. Also, the path 1-3-4 is the optimal path from node 1 to node 4 even though its subpath 1-3 is sub-optimal, violating the subpath optimality condition. Further, the subpath 1-3 is dominated on both objectives, violating the subpath non-dominance principle.
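The numbers in this illustration can be checked with a short script. The following is a minimal sketch, not part of the original study: it evaluates equations (1) and (2) for the four paths of Figure 1. The function names are hypothetical, and the assignment of (mean, variance) values to arcs (1, 2) and (2, 3) is an assumption inferred from the path costs reported in the figure.

```python
import math

# (mean, variance) per arc, as read from Figure 1. The values on arcs
# (1, 2) and (2, 3) are an assumption inferred from the reported path costs.
LINKS = {
    (1, 3): (3.5, 1.0),
    (3, 4): (1.0, 1.0),
    (1, 2): (2.0, 0.0),
    (2, 3): (1.0, 0.8),
}
# Non-zero correlation coefficients between pairs of arcs (symmetric).
CORR = {
    frozenset([(1, 3), (3, 4)]): -0.5,
    frozenset([(2, 3), (3, 4)]): 0.5,
}
DELTA = 0.2  # weight on the mean; (1 - DELTA) is the weight on the std. dev.


def path_moments(nodes):
    """Mean and variance of a path given as a node sequence, per equation (1)."""
    arcs = list(zip(nodes[:-1], nodes[1:]))
    mean = sum(LINKS[a][0] for a in arcs)
    var = sum(LINKS[a][1] for a in arcs)
    for k, a in enumerate(arcs):
        for b in arcs[k + 1:]:
            rho = CORR.get(frozenset([a, b]), 0.0)
            # Each unordered pair appears twice in the double sum of (1).
            var += 2.0 * rho * math.sqrt(LINKS[a][1] * LINKS[b][1])
    return mean, var


def robust_cost(nodes, delta=DELTA):
    """Robust cost of a path, per equation (2)."""
    mean, var = path_moments(nodes)
    return delta * mean + (1.0 - delta) * math.sqrt(var)


for p in ([1, 3], [1, 2, 3], [1, 3, 4], [1, 2, 3, 4]):
    print(p, path_moments(p), round(robust_cost(p), 3))
```

Subpath 1-3 has the higher robust cost at node 3, yet its extension 1-3-4 is cheaper than 1-2-3-4 at node 4, which is the failure of subpath optimality described above.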
(a) Example Network. Arcs labelled with (mean, variance): (1, 2): 2, 0; (2, 3): 1, 0.8; (1, 3): 3.5, 1; (3, 4): 1, 1

(b) Path/Sub-path Costs

Path       Mean   Var.   Robust Cost
1-3        3.5    1      1.5
1-2-3      3      0.8    1.31
1-3-4      4.5    1      1.7
1-2-3-4    4      2.7    2.11

Figure 1: Failure of Subpath Optimality and Subpath non-dominance ($\delta = 0.2$, $\rho_{(2,3)(3,4)} = 0.5$, $\rho_{(1,3)(3,4)} = -0.5$)

3.3 Sample-Approximation Based Reformulation

The MRCP problem in equations (3) requires the mean, variance and correlations of link travel times. However, empirical estimation of the correlation structure is data intensive and non-trivial for reasons noted previously. To circumvent these difficulties, a sampling based approach is adopted and the objective is reformulated accordingly below.

Let $t_{di}$ represent the travel time realization on day $d$ on link $i$, measured over $D$ days. The sample mean is given as $\hat{\mu}_i = D^{-1} \sum_{d=1}^{D} t_{di}$. Here, the link travel time distributions are assumed to be stationary, that is, all measurements $t_{di}$ come from the same multivariate distribution. We estimate the mean and variance of travel time on path $P$ from the sample by

\[ \hat{\mu}_P = \sum_{i \in P} \hat{\mu}_i \tag{4a} \]

\[ \hat{\sigma}_P^2 = \frac{1}{D-1} \sum_{d=1}^{D} \left( t_{dP} - \hat{\mu}_P \right)^2 = \frac{1}{D-1} \sum_{d=1}^{D} \left( \sum_{i \in P} t_{di} - \sum_{i \in P} \hat{\mu}_i \right)^2 \tag{4b} \]

Note that the path travel time realization on day $d$, $t_{dP}$, can be expressed as the sum of link travel times on that day, $t_{dP} = \sum_{i \in P} t_{di}$. Using the equations in (4), the robust cost objective in equation (3a) can be reformulated using the sample mean and variance instead of the population values as follows:

\[ \min\; RC_P(x) = \delta \sum_{i \in A} \hat{\mu}_i x_i + (1 - \delta) \left[ \frac{1}{D-1} \sum_{d=1}^{D} \left( \sum_{i \in A} t_{di} x_i - \sum_{i \in A} \hat{\mu}_i x_i \right)^2 \right]^{0.5} \tag{5} \]

where $x$ represents the vector of $m$ binary variables $x_i \in \{0, 1\}$ which satisfy the flow constraints of unit flow from source $r$ to sink $s$. The above expression can be written in a concise form as follows,

\[ \min\; RC_P(x) = \delta \hat{\mu}^T x + (1 - \delta) \left[ \frac{1}{D-1} x^T C^T C x \right]^{0.5} \tag{6} \]
where $\hat{\mu}$ represents the vector of sample means of all the links and $C$ is a $D \times m$ matrix of deviations from the mean whose elements are given as $c_{di} = t_{di} - \hat{\mu}_i$. Note that the variance of a path in (5) and (6) is represented as the sum of squares of $D$ separate link-additive objectives $c_{dP} = \sum_{i \in P} c_{di}$. Thus the problem has been reformulated as a multi-objective problem ($D + 1$ objectives including the mean).

Typically, multi-day travel time data are collected by many traffic agencies using probe vehicles, Bluetooth sources, and historical archives from other ITS sensors. These data can be readily used to optimize robust cost based on the reformulated objective in equation (6) without quantifying distributions or correlations explicitly.

4 Proposed Algorithm for determining MRCP

In this section we provide an overview of the algorithm, discuss the subpath elimination criterion and its proof of correctness, and then present the implementation details. The pseudo code of the algorithm is depicted in Figure 2.

4.1 Overview of Algorithm

The input to the algorithm is the O-D pair $(r, s)$ for which the minimum robust cost path is sought and the travel time realizations over $D$ days. The algorithm involves two main procedures, namely network transformation to reduce the dimensionality of the problem and network pruning to find the optimal path.

Network transformation involves two sequential operations: (i) Dial efficient links are identified with the sample mean travel times as link costs. This step leads to reduced network dimensionality (∼50%) and an acyclic sub-network. (ii) Johnson's transformation is applied only to the Dial efficient links, with cost vector $c_i$ (whose element for day $d$ is the deviation of the travel time on day $d$ from the sample mean), to obtain the reduced cost-vectors $w_i$ (defined in (14)).
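To make the input and the cost vectors $c_i$ concrete, the following is a minimal sketch (not taken from the paper) that builds the deviation matrix $C$ from a made-up $D \times m$ sample and checks that the robust cost of a path computed through $C$, as in equation (6), matches the value computed directly from the daily path travel times, as in equations (4). The data and the function names `sample_robust_cost` and `direct_robust_cost` are illustrative assumptions.

```python
import math
import statistics


def sample_robust_cost(T, x, delta):
    """Robust cost via equation (6). T is a D x m list of rows, x a 0/1 vector."""
    D, m = len(T), len(T[0])
    mu_hat = [sum(T[d][i] for d in range(D)) / D for i in range(m)]
    # Deviation matrix C with elements c_di = t_di - mu_hat_i.
    C = [[T[d][i] - mu_hat[i] for i in range(m)] for d in range(D)]
    mean_term = sum(mu_hat[i] * x[i] for i in range(m))
    # (C x)_d is the day-d path deviation c_dP; x^T C^T C x is its sum of squares.
    c_path = [sum(C[d][i] * x[i] for i in range(m)) for d in range(D)]
    var_term = sum(c * c for c in c_path) / (D - 1)
    return delta * mean_term + (1.0 - delta) * math.sqrt(var_term)


def direct_robust_cost(T, x, delta):
    """Robust cost from daily path travel times, using equations (4a) and (4b)."""
    path_times = [sum(t * xi for t, xi in zip(row, x)) for row in T]
    return (delta * statistics.mean(path_times)
            + (1.0 - delta) * statistics.stdev(path_times))


# Made-up sample: D = 4 days, m = 3 links; the path uses links 0 and 1.
T = [[10.0, 5.0, 7.0],
     [12.0, 4.0, 9.0],
     [9.0, 6.0, 8.0],
     [13.0, 5.0, 6.0]]
x = [1, 1, 0]
print(sample_robust_cost(T, x, 0.5), direct_robust_cost(T, x, 0.5))
```

No covariance matrix is formed at any point; the correlation between links 0 and 1 enters only through the day-by-day sums, which is the sense in which the representation is implicit.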
The advantage of this transformation is two-fold. First, the transformed costs are non-negative for Dial efficient links, unlike the cost vector $c_i$. Second, a super sink node $s^*$ and a new link $(s, s^*)$ are added to the transformed network and the costs $w_{(s,s^*)}$ are set so that all negative cost elements associated with $c_i$ are localized on this link.

By exploiting this localization of negative costs on to a single arc, a new subpath elimination criterion is proposed for the transformed network with reduced costs (discussed in detail in Section 4.2.1) and its correctness is established. This criterion eliminates subpaths that satisfy either of the following conditions:

1. The lower bound on the variance of any path containing the current subpath exceeds a norm-based threshold, and the lower bound on the robust cost of the path is larger than the corresponding upper bound on the optimal path (7a).

2. The lower bound on the weighted mean component of the robust cost of any path containing the current subpath exceeds the upper bound on the robust cost of the optimal path (7b).

Finally, based on this subpath elimination criterion, a new pruning procedure is implemented on the transformed network. This pruning procedure eliminates sub-optimal subpaths using a modified label correcting approach (as described in Section 4.3.2).

4.2 Subpath Elimination Criterion and Proof of Correctness

4.2.1 Description of Subpath Elimination Criterion

In this subsection the norm-based subpath elimination criterion is explained. The elimination criterion depends on only two link separable objectives, thus reducing the dimension from the $D + 1$ objectives in equation (5). Consider an intermediate node $u$ and a subpath $P_{r-u}$ from source $r$ to node $u$ with two attributes $d^m(u)$ and $d^v(u)$.
Here, $d^m(u)$ represents the total expected travel time of the subpath and $d^v(u)$ represents the sum of norm-squares ($\sum_{i \in P_{r-u}} \|w_i\|^2$, obtained from the network transformation) of the transformed cost vectors of the subpath. Let $SP^m(u)$ denote the shortest path label (value) from $u$ to sink $s$ with link costs $\hat{\mu}_i$, and let $SP^v(u)$ denote the shortest path distance from $u$ to sink $s$ with link costs $\|w_i\|^2$. Let $RC_{UB}$ be an upper bound on the optimal robust cost. The elimination criterion at node $u$ is given by,

\[ \text{Condition 1:} \quad \left[ d^v(u) + SP^v(u) \right] \geq \|\beta\|^2 \quad \text{AND} \]

\[ \text{Condition 2:} \quad \delta \left[ d^m(u) + SP^m(u) \right] + \frac{(1 - \delta)}{(D-1)^{0.5}} \left[ \left( d^v(u) + SP^v(u) \right)^{0.5} - \|\beta\| \right] > RC_{UB} \tag{7a} \]

OR

\[ \text{Condition 3:} \quad \delta \left[ d^m(u) + SP^m(u) \right] > RC_{UB} \tag{7b} \]

where $-\beta$ represents the localized cost vector on the link $(s, s^*)$ in the transformed network. Condition 1 checks whether the lower bound on the variance of a path $P$ with $P_{r-u}$ as subpath is at least the variance on the link $(s, s^*)$. Condition 2 evaluates whether the lower bound on the robust cost of a path $P$ with $P_{r-u}$ as subpath is greater than the upper bound on the minimum path robust cost. Condition 3 checks whether the lower bound on the weighted mean component of a path $P$ with $P_{r-u}$ as subpath is greater than the upper bound on the minimum path robust cost. The correctness of this elimination criterion is established in the next section.

4.2.2 Proof of Correctness

Proposition 1 establishes that if either conditions 1 and 2 (7a) or condition 3 (7b) hold at node $u$ for some subpath $P_{r-u}$, that subpath cannot be part of the optimal path from $r$ to $s$. Proposition 2 establishes that the optimal robust cost path from $r$ to $s$ is one among the paths that remain after eliminating subpaths that satisfy the elimination criterion.

Proposition 1.
Any path from source $r$ to an intermediate node $u$, given by $P_{r-u}$, which satisfies condition (7) cannot be a subpath of the minimum robust cost path from $r$ to sink $s^*$ on the transformed network.

Proof. The proposition is proved by demonstrating that the LHS of the second inequality in (7a) and the LHS of inequality (7b) are lower bounds on the robust cost of any path $P$ from source $r$ to sink $s^*$ which includes $P_{r-u}$ as a subpath, where $u \neq s^*$. Consequently, if the lower bound on the robust cost of any path $P$ with $P_{r-u}$ as a subpath is greater than the upper bound on the optimal path robust cost, then the subpath $P_{r-u}$ cannot be contained in the optimal path.

First, the validity of (7a) is demonstrated. Let the first inequality in (7a) be true. The variance of a path $P - s^*$ on the transformed network from source $r$ to sink $s^*$, with $P_{r-u}$ as the subpath from $r$ to node $u$, is given as,

\[ (D-1)\, Var_{P-s^*} = \Big\| \sum_{i \in P} w_i - \beta \Big\|^2 \geq \Big\| \sum_{i \in P} w_i \Big\|^2 + \|\beta\|^2 - 2 \Big\| \sum_{i \in P} w_i \Big\| \|\beta\| = \left( \Big\| \sum_{i \in P} w_i \Big\| - \|\beta\| \right)^2 \tag{8} \]

In (8), the inequality follows from the norm identity and the Cauchy-Schwarz inequality. Note that all elements of the transformed cost-vectors are non-negative as a result of the transformation, $w_i \succeq 0$ (14). Using the fact that $P_{r-u}$ is a subpath of $P - s^*$, and as we have assumed the first inequality in (7a) (condition 1) to be true, we can write,

\[ \Big\| \sum_{i \in P} w_i \Big\|^2 \geq \sum_{i \in P} \|w_i\|^2 \tag{9a} \]

\[ \sum_{i \in P} \|w_i\|^2 \geq \sum_{i \in P_{r-u}} \|w_i\|^2 + SP^v(u) \geq \|\beta\|^2 \tag{9b} \]

where $SP^v(u)$ denotes the shortest path distance from $u$ to $s$ with link costs $\|w_i\|^2$. Applying the inequalities (9) to the last expression in (8), we can write,

\[ \left( \Big\| \sum_{i \in P} w_i \Big\| - \|\beta\| \right)^2 \geq \left( \Big[ \sum_{i \in P} \|w_i\|^2 \Big]^{1/2} - \|\beta\| \right)^2 \geq \left( \Big[ \sum_{i \in P_{r-u}} \|w_i\|^2 + SP^v(u) \Big]^{1/2} - \|\beta\| \right)^2 \tag{10} \]

From equations (8) and (10) and given (9) we can write,
\[ (D-1)\, Var_{P-s^*} \geq \left( \Big[ \sum_{i \in P_{r-u}} \|w_i\|^2 + SP^v(u) \Big]^{1/2} - \|\beta\| \right)^2 \tag{11} \]

Thus we have obtained a lower bound on the travel time variance of any path $P - s^*$ with $P_{r-u}$ as its subpath. Similarly, the lower bound on the mean travel time of $P - s^*$ can be obtained as follows,

\[ \mu_{P-s^*} \geq \sum_{i \in P_{r-u}} \hat{\mu}_i + SP^m(u) \tag{12} \]

where $SP^m(u)$ denotes the shortest path distance from $u$ to $s$ with link costs $\hat{\mu}_i$. Using (12) and (11) we can write

\[ RC_{P-s^*} \geq \delta \Big[ \sum_{i \in P_{r-u}} \hat{\mu}_i + SP^m(u) \Big] + \frac{(1 - \delta)}{(D-1)^{0.5}} \left( \Big[ \sum_{i \in P_{r-u}} \|w_i\|^2 + SP^v(u) \Big]^{1/2} - \|\beta\| \right) \tag{13} \]

Thus, the LHS of the second inequality in expression (7a) is a lower bound on the robust cost of any path $P - s^*$ with $P_{r-u}$ as subpath. The LHS of the inequality in expression (7b) is a lower bound on the weighted mean travel time of any path $P - s^*$ with $P_{r-u}$ as subpath. As the variance of a random variable is always non-negative, the LHS of expression (7b) is also a lower bound on the robust cost of any path $P - s^*$ with $P_{r-u}$ as subpath.

Proposition 2. The minimum robust cost path $P^*$ between source $r$ and sink $s$ will be present in the path-set at the sink at the termination of the pruning procedure.

Proof. This claim is proved by contradiction. Let us assume that the minimum robust cost path $P^*$ is not present in the path-set at the sink $s$. This implies that at least one of its subpaths, $P^*_{r-u}$, was discarded as it satisfied the elimination criterion in (7), which contradicts Proposition 1. Thus, the minimum robust cost path $P^*$ will be present in the path-set at the sink after the termination of the pruning procedure.

4.3 Implementation of the Algorithm

The steps of the algorithm presented in Figure 2 are discussed below. An illustrative example is hosted at webpage http://115.115.108.126/trrhost/mrcp_example.pdf and is not included here due to space limitations.

Step 0.
Inputs:
  (i). Network $G(N, A)$, origin $r$ and destination $s$
  (ii). Travel time realizations $t_{di}$, where $d \in \{1, \ldots, D\}$ and $i \in A$
Step 1. Network Transformation: To transform the cost-vectors to be component-wise non-negative
  (i). Dial Efficient Links: Perform a single sink shortest path algorithm with mean travel times ($\hat{\mu}_i$) as link costs. Determine the set of links which are Dial efficient, $A'$: $i \equiv (u, v) \in A'$ if $d(u) - d(v) > 0$, where $d(u)$ represents the shortest path distance from $u$ to sink $s$.
  (ii). Johnson's Transformation: Perform Johnson's transformation for each day $d \in \{1, \ldots, D\}$. For each day $d$,
    (a). Compute the shortest path labels $l_d(v)$, $v \in N$, with $c_{di} = t_{di} - \hat{\mu}_i$, $\forall i \in A'$ as costs.
    (b). Compute the transformed costs
         $w_{di} = c_{di} + l_d(u) - l_d(v)$, $\forall i \equiv (u, v) \in A'$
         $\beta_d = -w_{d,(s,s^*)} = l_d(r) - l_d(s)$
  This results in a transformed network $G_2(V, E)$ where $V = N \cup \{s^*\}$ and $E = A' \cup \{(s, s^*)\}$.
Step 2. Network Pruning to eliminate subpaths which are sub-optimal: Perform the procedure outlined in Figure 3.
Step 3. Finding the optimal Robust Cost Path: Identify the minimum robust cost path from the path (label) set at sink $s$ by calculating the robust cost of all the paths in it.

Figure 2: MRCP algorithm - Correlated Links

4.3.1 Network transformation

Network transformation involves two operations: 1) eliminating the Dial inefficient links and 2) transforming the original cost-vectors into reduced costs by applying Johnson's transformation.

Identifying Dial efficient links: In this step, links which do not satisfy a criterion similar to Dial's criterion used in the STOCH algorithm are eliminated [7]. A weaker condition is employed in the present study, where the links which take the user farther from the given destination in average time are eliminated. These links are found by solving the single destination shortest path problem from all sources to sink $s$ with the sample mean travel times as link costs.
A link $(u, v)$ is Dial efficient only if its head node $v$ is closer to the sink than its tail node $u$ (i.e. the optimal distance labels satisfy $d(u) - d(v) > 0$). There are two advantages from this transformation: the network reduces by 40-50% (number of arcs) and the resulting sub-network is acyclic.

Johnson's Transformation to obtain Reduced Costs: First, the Dial efficient sub-network above is transformed into a network $G_1$ as follows:

1. A new super source node $r^*$ is added.

2. Dummy arcs connecting the node $r^*$ to all nodes are created with zero cost.

3. Cost vectors $c_i$ are defined for the Dial efficient links (referred to as real links) as follows: $c_i = [c_{1i}, c_{2i}, \cdots, c_{Di}]$ where $c_{di} = t_{di} - \hat{\mu}_i$ represents the difference between the observed link travel time on day $d$, $t_{di}$, and the sample mean on link $i$, $\hat{\mu}_i$, over all days.

Note that these costs $c_{di}$ will be positive on some days and negative on others. Therefore, the associated path costs and optimal costs on paths with only real links may be either positive or negative for a given day $d$.

For the cost vector $c_i$, note that the paths from $r^*$ to any node $v$ in network $G_1$ comprise real paths (paths containing only Dial efficient links) and the virtual path consisting of the dummy arc $(r^*, v)$. For a given day $d$, if the minimum path cost on real paths is positive, then the virtual path $(r^*, v)$ is optimal as it has a distance label of zero (by construction). On the other hand, if the minimum path cost on real paths is negative, then the optimal distance label to node $v$ will be negative. Thus, the optimal distance labels (based on cost vector $c_i$) to any node will be either zero or negative. Johnson's transformation is applied on this transformed network $G_1$ to convert the costs $c_i$ into reduced costs $w_i$, which are non-negative.
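The construction above can be sketched for a single day as follows. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the small acyclic network and its deviations are made up, the function name `johnson_reduced_costs` is hypothetical, and the reduced-cost formula $w_{di} = c_{di} + l_d(u) - l_d(v)$ is the one listed in Step 1 of Figure 2. Initializing every label to zero plays the role of the zero-cost dummy arcs from the super source $r^*$.

```python
def johnson_reduced_costs(nodes, arcs, cost, r, s):
    """Labels from a super source, reduced costs, and beta for one day d."""
    # Zero-cost dummy arcs from r* to every node: all labels start at 0,
    # so the final labels are zero or negative.
    label = {v: 0.0 for v in nodes}
    # Bellman-Ford style relaxation; the Dial efficient sub-network is
    # acyclic, so |N| - 1 passes are sufficient.
    for _ in range(len(nodes) - 1):
        for (u, v) in arcs:
            if label[u] + cost[(u, v)] < label[v]:
                label[v] = label[u] + cost[(u, v)]
    # Reduced costs on real links (non-negative by shortest path optimality).
    w = {(u, v): cost[(u, v)] + label[u] - label[v] for (u, v) in arcs}
    # The dummy arc (s, s*) carries cost -beta = l(s) - l(r).
    beta = label[r] - label[s]
    return label, w, beta


# Made-up acyclic network with deviations c_di for one day.
nodes = [1, 2, 3, 4]
arcs = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
cost = {(1, 2): 2.0, (1, 3): -1.0, (2, 3): -3.0, (2, 4): 1.5, (3, 4): 0.5}
label, w, beta = johnson_reduced_costs(nodes, arcs, cost, r=1, s=4)


def path_cost(values, path):
    """Sum of arc values along a path given as a node sequence."""
    return sum(values[a] for a in zip(path[:-1], path[1:]))


for p in ([1, 2, 3, 4], [1, 3, 4], [1, 2, 4]):
    # Original day-d cost versus reduced cost including the arc (s, s*).
    print(p, path_cost(cost, p), path_cost(w, p) - beta)
```

In this example every reduced cost on a real link is non-negative, and the only negative element sits on the arc $(s, s^*)$, while each path keeps its original day-$d$ cost.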
More formally, as per Johnson's transformation, the reduced cost on a given arc $i \equiv (u, v)$ on day $d$ is given as,

\[ w_{di} = c_{di} + \pi_d(v) - \pi_d(u) \tag{14} \]

where the node potential for node $u$ on day $d$ is given as $\pi_d(u) = -l_d(u)$, and $l_d(u)$ is the shortest path cost from $r^*$ to $u$ with $c_{di}$ as link costs. The resulting reduced costs on all arcs in $G_1$ can be shown to be non-negative for each day $d$. To see this, note that as per the shortest path optimality principle, the optimal distance labels satisfy the following inequality for all arcs $i \equiv (u, v)$ in network $G_1$:

\[ l_d(v) \leq c_{di} + l_d(u) \tag{15} \]

Substituting $\pi_d(u) = -l_d(u)$ and $\pi_d(v) = -l_d(v)$ and rearranging (15) yields $w_{di} = c_{di} + \pi_d(v) - \pi_d(u) \geq 0$.

Next, a new super sink node $s^*$ is added to the network $G_1$ and a dummy arc $(s, s^*)$ between the sink $s$ and the super sink $s^*$ is created. This augmented network is referred to as $G_2(V, E) = G_1 \cup (s, s^*)$. The reduced cost on this dummy link on day $d$ is set as $w_{d(s,s^*)} = l_d(s) - l_d(r)$ and is denoted as $-\beta_d$ for ease of notation. This term may be either positive or negative. Thus, the reduced cost transformation ensures that all the negative cost elements are localized to arc $(s, s^*)$ in network $G_2$.

Next, it is shown that the path cost on day $d$ from $r$ to $s$ on network $G(N, A)$ based on cost vector $c_i$ is the same as the path cost from $r$ to $s^*$ on the transformed network $G_2$ with the reduced costs $w_i$. Consider a path $P$ from $r$ to $s$ in the network $G$ and the corresponding path $P' = P \cup (s, s^*)$. The original path cost on $P$ for day $d$ is given as $c_{dP} = \sum_{i \in P} c_{di}$. The reduced path cost on $P'$ for the same day is given as,
4.3 Implementation of the Algorithm wdP ∪(s,s∗ ) = X i∈P = X wdi − βd = Prakash and Srinivasan X [cdi + ld (u) − ld (v)] − βd i≡(u,v)∈P cdi + ld (s) − ld (t) − βd = i∈P X cdi (16) i∈P 4 Due to this equivalence between path costs on network G from r to s, and reduced costs on network G2 from r to s∗ , the optimization and subpath elimination criterion are developed using network G2 and reduced costs wi where all negative cost elements are localized on to the link (s, s∗ ). 5 4.3.2 1 2 3 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 Network Pruning The transformed network G2 is pruned using a label correcting based approach and the “subpath elimination criterion” (refer section 4.2.1). The first step in pruning is precomputation of the following quantities. First, we obtain SP m (u) and SP v (u), ∀u ∈ V except s∗ . SP m (u) is the minimum expected travel time from node u to s, and SP v (u) is the shortest path cost from u to s where, cost on links i ∈ E are the norm-square (||wi ||2 ) of the transformed cost-vectors. These are obtained by solving single sink shortest path problem using label correcting procedure, with corresponding costs on the links. Next, the upper bound on optimal robust cost, RCU B , is computed as the robust cost of least expected travel time (LET) path. After precomputation, the network G2 is pruned based on the norm based subpath elimination criterion given in (7). As shown in Proposition 2 the pruning procedure returns a set of candidate paths at the sink node s that is guaranteed to contain the optimal path. A label correcting approach is proposed for pruning which is depicted in Figure 3 and explained below. A scan eligible list is maintained consisting of only those nodes whose subpaths are not discarded by the elimination criterion. The nodes in the scan eligible list are processed sequentially. 
Specifically, for each node $u$ in the scan-eligible list, each of the outgoing arcs $(u, v)$ is processed to obtain the labels for mean and variance for the connected downstream node $v$. A label $d_k(u) = [d_k^m(u), d_k^v(u)]$ is maintained for each subpath $k$ at each node $u \in V$. For subpath $k$, $d_k^m(u)$ and $d_k^v(u)$ represent the mean travel time and the sum of the norm-squares of the link transformed cost vectors ($\sum_{i \in k} \|w_i\|^2$), respectively. Node $v$ is added to the scan-eligible list if at least one of the new (not stored at $v$) subpaths obtained by adding link $(u, v)$ to the subpaths at $u$ does not satisfy the elimination criteria. The algorithm terminates when the scan-eligible list is empty. Note that, unlike in standard label correcting approaches, multiple labels and predecessors may be maintained for each node. The optimal path can be shown to be contained in the set of paths that remain at termination (Proposition 2). Therefore, once the label correcting algorithm terminates, the optimal path is identified by computing the robust cost of all non-pruned paths to the destination.

Step 0. Inputs.
  (a) Transformed network $G_2(V, E)$ with source $r$ and sink $s$.
  (b) Link attributes, which include the sample mean of travel times $\hat{\mu}_i$ and the transformed cost vectors $w_i$, $\forall i \in E$.
Step 1. Pre-computation.
  (a) Compute the shortest path distances $SP^m(u)$, with $\hat{\mu}_i$ as link costs, and $SP^v(u)$, with $\|w_i\|^2$ as link costs, from every node $u \in V$ to $s$, where $i \in E$.
  (b) Let $RC_{UB} = RC_{P_{LET}}$, where $P_{LET}$ is the least expected travel time path from $r$ to $s$.
Step 2. Initialization.
  (a) Set the first label of source $r$ to zero for both objectives: $d_1^m(r) = d_1^v(r) = 0$. Initialize the sets of labels at the remaining nodes as null sets: $D(u) = \emptyset \;\; \forall u \in N \setminus \{r\}$.
  (b) Add source node $r$ to the candidate node list: $L = \{r\}$.
Step 3. Node Selection.
  IF set $L$ is empty ($L = \emptyset$), GO TO Step 6. ELSE select a node $u \in L$.
Step 4. Node Processing. For all nodes $v$ such that $(u, v) \equiv i \in E$, do the following:
  (a) Compute set $\tilde{D}(v)$. For every label $k$ of node $u$, compute the temporary label $\tilde{d}_k(v) = d_k(u) \cup (u, v)$ to node $v$, obtained by adding arc $(u, v)$ to label $d_k(u)$:
        $\tilde{d}_k^m(v) = d_k^m(u) + \hat{\mu}_i$
        $\tilde{d}_k^v(v) = d_k^v(u) + \|w_i\|^2$
  (b) Update labels of $D(v)$: check the subpath elimination criterion. For each acyclic temporary label $\tilde{d}_k(v) \notin D(v)$: IF NOT (7a) OR (7b) for label $\tilde{d}_k(v)$, THEN set $D(v) = D(v) \cup \{\tilde{d}_k(v)\}$.
Step 5. Update List.
  (a) IF the set of labels of node $v$, $D(v)$, has been modified and node $v \notin L$, set $L = L \cup \{v\}$.
  (b) Delete node $u$ from the list: $L = L \setminus \{u\}$.
  (c) GO TO Step 3.
Step 6. Return. Return the path (label) set $D(s)$ at sink $s$.

Figure 3: Label Correcting Based Pruning

4.4 Computational Complexity

The main steps involve: Dial efficient link identification ($O[mn]$), network transformation ($O[mnD + mD]$), upper bound computation ($O[m + n \log n]$), the pruning procedure ($O[mn\kappa]$), and robust cost evaluation of the path set at termination ($O[\kappa]$). Here $\kappa$ is the maximum size of the non-pruned path set from the origin to some non-pruned node. The number of arcs, nodes, and days in the analysis are denoted by $m$, $n$ and $D$ respectively. This gives an overall complexity of $O[mn(D + 1) + mD + (m + n \log n) + mn\kappa + \kappa]$, which is pseudo-polynomial in the path-set size $\kappa$.

5 Computational Experiments

Section 5.1 quantifies the performance of the proposed algorithm on real transportation networks, followed by an analysis of the effect of sample approximation on optimal solution quality (Section 5.2). The experiments were conducted on a Windows 7 (32-bit) computer with two 2.9 GHz CPUs (Core-2 Duo) and 3 GB of RAM.
5.1 Performance on Real World Networks

The performance of the pruning algorithm is studied on real world transportation networks obtained from the website hosted by Bar-Gera [http://www.bgu.ac.il/~bargera/tntp/]. The link travel time realizations over $D$ days are assumed to follow a multivariate shifted lognormal distribution. The univariate shifted lognormal random variable used for generating these travel times is given by

$$t_i = \gamma_i + \exp(\mu_i + \sigma_i z_i) \tag{17}$$

where $z_i \sim N(0, 1)$ is a standard normal random variable and $\gamma_i$ represents the free-flow travel time. For the current study, the shifted lognormal parameters for each link on the network were generated uniformly in the following ranges: 65-125 sec/km for $\gamma$, 20-200 sec/km for the excess mean ($E[t_i] - \gamma_i$), and 0.05-0.75 for the coefficient of variation. These values were obtained based on data from the Chennai Comprehensive Traffic Study (CCTS, 2008). The correlation matrix was generated uniformly for each network from the space of positive definite matrices. Then, the travel time realizations were generated on all the links for $D = 100$ days. On each network, the algorithm was applied to find MRCPs for 25 randomly generated OD pairs. The performance metrics for analysis include the average computation time and the average non-pruned path-set size (at intermediate nodes as well as at the destination).

The results from these computational experiments are presented in Table 1(a). The computational times were found to be less than 5 sec for 98% of OD pairs on small to moderate networks, suggesting acceptable performance for moderately sized (< 1500 nodes) networks. On the large network of Austin, the average computational time is around 1.2 minutes, which is still acceptable for a stand-alone application.
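The generation of link travel time realizations from (17) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the correlation matrix is taken to act on the underlying standard normal variables $z_i$, and the coefficient of variation is assumed to refer to the lognormal (excess) part of the travel time; the paper does not state either choice.

```python
import math
import random


def lognormal_params(excess_mean, cv):
    """(mu, sigma) of a lognormal variable with the given mean and
    coefficient of variation.

    Assumption: cv describes the excess part exp(mu + sigma * z),
    not the shifted total travel time.
    """
    sigma2 = math.log(1.0 + cv * cv)
    return math.log(excess_mean) - 0.5 * sigma2, math.sqrt(sigma2)


def cholesky(a):
    """Lower-triangular factor low with low * low^T = a, for a
    symmetric positive definite matrix given as a list of lists."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def sample_travel_times(gamma, mu, sigma, corr, n_days, seed=0):
    """Return times[d][i] = gamma[i] + exp(mu[i] + sigma[i] * z[i]),
    where z is a standard normal vector with correlation matrix corr."""
    rng = random.Random(seed)
    low = cholesky(corr)
    n_links = len(gamma)
    times = []
    for _ in range(n_days):
        # Independent standard normals, then correlate them with the factor.
        e = [rng.gauss(0.0, 1.0) for _ in range(n_links)]
        z = [sum(low[i][k] * e[k] for k in range(i + 1)) for i in range(n_links)]
        times.append([gamma[i] + math.exp(mu[i] + sigma[i] * z[i])
                      for i in range(n_links)])
    return times
```

Each row of the returned list plays the role of one day of link travel times, so calling `sample_travel_times(..., n_days=100)` would mimic the $D = 100$ setting described above under the stated assumptions.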
(a) Performance on Real World Networks

Network                          Comp. time (sec)             Path-set size at node         Path-set size at sink
Name              n       m      Avg    Std dev   90th %      Avg    Std dev   90th %       Avg     Std dev    90th %
Anaheim          416     914     0.01     0.02     0.06       7.82    12.54    35.29       45.64     92.11     246.40
Chicago Sketch   933    2950     6.26    30.51     1.41      53.53   161.94   138.55      928.12   3748.07    1207.80
Barcelona       1020    2522     0.12     0.26     0.64      22.40    39.94    77.24      146.84    257.06     569.60
Winnipeg        1052    2836     0.67     2.17     2.64      30.57    67.19   114.44      307.32    732.57    1476.80
Chennai         1279    3644    11.63    55.32     5.25      88.38   294.11   180.01     1011.88   3766.06    1804.00
Austin          7388   18961    72.94   152.08   404.64     218.71   403.48   921.77     3259.40   6703.75   13999.20

(b) Quantification of sampling errors (in %) across ODs

Path length   Sample size       Bias             Var error          Total error
                             Mean   Stdev      Mean    Stdev      Mean   Stdev
< 10 km            5         1.99    4.28      5.05    23.44      1.40    2.56
                  10         1.56    3.58      3.15    18.81      1.17    2.31
                  15         1.33    2.97      2.46    15.62      1.34    2.51
                  20         1.04    2.24      1.73    13.90      1.38    2.58
                  25         0.87    1.93      1.24    12.50      1.64    2.72
10-20 km           5         6.16    6.87      2.11    12.58      4.35    4.10
                  10         4.13    5.31      0.83     8.97      3.32    3.62
                  15         2.82    4.02      0.64     7.14      2.72    3.35
                  20         2.19    3.05      0.79     6.35      2.30    3.01
                  25         1.84    2.75      0.59     5.49      1.90    2.75
> 20 km            5         8.15    6.67      1.20    10.22      3.91    2.99
                  10         4.99    4.66      0.66     7.19      2.94    2.49
                  15         3.63    3.66      0.51     5.82      2.38    2.16
                  20         3.03    3.22      0.29     4.99      2.15    2.05
                  25         2.49    2.69      0.28     4.37      1.85    1.91

Table 1: Computational Experiments

However, the path-set sizes may be higher for a few OD pairs, especially on longer paths. Network density (arc to node ratio) significantly affects the algorithm's performance, as seen from the results on the Chicago, Barcelona and Winnipeg networks. Further computational experiments suggested a weak influence of network structure on computational times and non-pruned path-set size (variables such as assortativity, betweenness centrality, closeness centrality, and path size were found to be influential).
Due to space limitations and the relatively weak influence, these are not reported in the paper. In contrast, other methods in the literature, namely the PIND procedure [21] and Lagrangian relaxation [24], do not guarantee the optimal solution. Lagrangian relaxation reported an average error of over 5.4% and suboptimality in 25.5% of cases, though it is likely to be faster.

5.2 Analysis of Sample Approximation on Solution Quality

Approximating the travel time distributions through a sample raises questions about the bias and precision of the solution obtained from the sample relative to the population. The extent of the error is quantified as described below.

A second set of computational experiments was performed on a real network of Chennai, India, with 1279 nodes and 3644 links. The empirically observed travel times on 94 links over a period of 100 days (from Nov 2012 to Feb 2013) were used in this set. On the other links of the network, independent travel times were generated from link-type-wise marginal distributions obtained from CCTS, 2008, which follow a shifted lognormal distribution. These travel time realizations (over 100 days) over the complete network are assumed to constitute the population in the current experiment. Then the sample-based MRCP algorithm was applied by varying the sample size systematically from 5 to 25 days (in steps of 5). For each sample size, the MRCP algorithm was applied with 100 draws. This was done for 15 OD pairs in groups of five, selected on the basis of their shortest path distances: less than 10 km, between 10 and 20 km, and over 20 km. Let $P_s^*$ and $P_p^*$ be the optimal paths in the sample and the population respectively.
To quantify the quality of the solution with the sample-based objective, the metrics measured were bias (difference in robust cost between $P_s^*$ in the sample and $P_p^*$ in the sample), variance error (difference in robust cost between $P_p^*$ in the sample and $P_p^*$ in the population) and total error (difference in robust cost between $P_s^*$ in the population and $P_p^*$ in the population). In Table 1(b) we present the percentage error values averaged over the OD pairs of similar path length.

As expected, as the sample size increases, both the mean and standard deviation of all the errors decrease. The path length seems to influence both the bias and variance components, and hence the total error. The OD pairs with shorter path lengths have lower sample errors and total errors, as the population optimal is more likely to be found in the sample. The mean total errors across samples for shorter paths lie in the range 1.17% - 1.64%. In contrast, errors of 1.85% - 4.35% were observed for the longer paths and may be explained by the smaller size of the non-pruned path set. The OD pairs with longer paths have lower variance error (0.28% - 1.20%) compared to shorter paths (1.24% - 4.28%), possibly reflecting a greater tendency to regress towards the mean in the former case. More importantly, the quality of the solution for the sample-based robust cost objective is found to be acceptable (1-2%) even with modest sample sizes in most cases.

6 Conclusion

This paper presents a sample based algorithm for the minimum robust cost path on a network with link travel time correlations. The problem was formulated as a separable multiobjective problem, where a sample based approach was adopted to represent the link travel time distributions and reformulate the objective function.
The sample based approach adopted was found to implicitly capture the path correlations, thus eliminating the need to explicitly estimate the link travel time correlation matrix, which is not trivial in practice, both conceptually and computationally. For this objective, solution procedures based on sub-path optimality and non-dominance are shown to be inapplicable. A new subpath elimination criterion was developed to eliminate suboptimal subpaths. This criterion is applied to a suitably transformed network with reduced costs, and a pruning algorithm is developed to discard sub-optimal subpaths for the robust cost objective. The transformations together with the elimination criterion enable the conversion of the MRCP problem from a $D + 1$ dimensional problem to a two-objective problem, leading to a significant gain in computational efficiency. The proposed algorithm is shown to yield an exact solution, and its computational complexity is shown to be pseudo-polynomial in terms of the number of non-pruned paths. The approach could also be extended, with repetitive application, to multi-day and quasi-dynamic scenarios.

The empirical findings from the computational experiments led to the following insights:

1. The algorithm is guaranteed to find the optimal path and performs reasonably well on small to medium (< 1500 nodes) sized networks, with computational time being less than 5 sec for 98% of OD pairs. On large networks, the computational times are higher, particularly for OD pairs with substantial spatial separation. In such cases, the number of non-pruned paths increases considerably.
2. The average error in the optimal path due to sampling approximation was found to be less than 5% for various sample sizes. The OD pairs with shorter path lengths have smaller bias than those with longer path lengths.
In contrast, the variance component of such OD pairs was found to be larger. The total error was found to increase with increasing path length, and may potentially be reduced by applying the algorithm repeatedly based on real-time traffic state updates and re-optimization.

The directions for future research include the transformation of cost vectors into positive components while negative cycles exist. Extending the present algorithm to time-dependent networks, online routing, and non-stationary distributions offers interesting and valuable scope for further investigation.

Acknowledgments

This research is supported in part by the Center of Excellence in Urban Transport funded by the Ministry of Urban Development, Government of India. This support is gratefully acknowledged. The authors would also like to thank Ravi Seshadri for his inputs in the discussions on ideas in the present study.

References

[1] Bates, J., Polak, J., Jones, P., and Cook, A. The valuation of reliability for personal travel. Transportation Research Part E 37 (2001), 191–229.
[2] Boyles, S. D. Operational, supply-side uncertainty in transportation networks: causes, effects, and mitigation strategies. PhD thesis, University of Texas, Austin, 2009.
[3] Carrion, C., and Levinson, D. Value of travel time reliability: A review of current evidence. Transportation Research Part A: Policy and Practice 46, 4 (2012), 720–741.
[4] Chen, A., and Zhou, Z. The α-reliable mean-excess traffic equilibrium model with stochastic travel times. Transportation Research Part B: Methodological 44, 4 (2010), 493–513.
[5] Chen, B. Y., Lam, W. H., Sumalee, A., and Li, Z.-l. Reliable shortest path finding in stochastic networks with spatial correlated link travel times. International Journal of Geographical Information Science 26, 2 (2012), 365–386.
[6] Cheung, R. K. Iterative methods for dynamic stochastic shortest path problems. Naval Research Logistics 45 (1998), 768–789.
[7] Dial, R. A probabilistic multipath traffic assignment model which obviates path enumeration. Transportation Research 5 (1971).
[8] Frank, H. Shortest paths in probabilistic graphs. Operations Research 17, 4 (1969), 583–599.
[9] Hall, R. The fastest path through a network with random time-dependent travel times. Transportation Science 20, 3 (1986), 182–188.
[10] Miller-Hooks, E., and Mahmassani, H. Path comparisons for a priori and time-adaptive decisions in stochastic, time-varying networks. European Journal of Operational Research 146, 1 (2003), 67–82.
[11] Mirchandani, P. Shortest distance and reliability of probabilistic networks. Computers & Operations Research 3, 4 (1976), 347–355.
[12] Mirchandani, P. B., and Soroush, H. Optimal paths in probabilistic networks: a case with temporary preferences. Computers & Operations Research 12, 4 (1985), 365–381.
[13] Nicholson, A. Estimating travel time reliability: Can we safely ignore correlation. In Delft University (2012).
[14] Nie, Y., and Wu, X. Shortest path problem considering on-time arrival probability. Transportation Research Part B: Methodological 43, 6 (2009), 597–613.
[15] Nikolova, E., Brand, M., and Karger, D. R. Optimal route planning under uncertainty. In ICAPS (2006), vol. 6, pp. 131–141.
[16] Polychronopoulos, G. H., and Tsitsiklis, J. N. Stochastic shortest path problems with recourse. Networks 27 (1996), 133–143.
[17] Pretolani, D. A directed hypergraph model for random time dependent shortest paths. European Journal of Operational Research 123, 2 (2000), 315–324.
[18] Recker, W., Chung, Y., Park, J., Wang, L., Chen, A., Ji, Z., Liu, H., Horrocks, M., and Oh, J. S. Considering risk-taking behavior in travel time reliability. Tech. Rep. UCB-ITS-PRR-2005-3, California PATH Research Report, 2005.
[19] Sen, S., Pillai, R., Joshi, S., and Rathi, A. A mean-variance model for route guidance in advanced traveler information systems. Transportation Science 35, 1 (2001), 37–49.
[20] Seshadri, R., and Srinivasan, K. Algorithm for determining most reliable travel time path on network with normally distributed and correlated link travel times. Transportation Research Record: Journal of the Transportation Research Board 2196 (2010), 83–92.
[21] Seshadri, R., and Srinivasan, K. An algorithm for the minimum robust cost path on networks with random and correlated link travel times. Network Reliability in Practice (2012), 171–208.
[22] Sivakumar, R., and Batta, R. The variance-constrained shortest path problem. Transportation Science 28, 4 (1994), 309–316.
[23] Waller, S. T., and Ziliaskopoulos, A. K. On the online shortest path problem with limited arc cost dependencies. Networks 40, 4 (2002), 216–227.
[24] Xing, T., and Zhou, X. Finding the most reliable path with and without link travel time correlation: A Lagrangian substitution based approach. Transportation Research Part B: Methodological 45, 10 (2011), 1660–1679.
[25] Zhou, X., and Xing, T. Reformulation and solution algorithms for absolute and percentile robust shortest path problems. IEEE Transactions on Intelligent Transportation Systems 14, 2 (2013), 943–954.