apr_02_pal 5613KB May 23 2014 06:15:14 PM
Davis Social Links
S. Felix Wu
Computer Science Department
University of California, Davis
[email protected]
http://www.cs.ucdavis.edu/~wu/

Urgent! Please contact me!
FROM:MR.CHEUNG PUI Hang Seng Bank Ltd Sai Wan Ho Branch 171 Shaukiwan Road Hong Kong. Please contact me on my personal box [[email protected]]
Let me start by introducing myself. I am Mr. Cheung Pui, director of operations of the Hang Seng Bank Ltd,Sai Wan Ho Branch. I have a obscured business suggestion for you. Before the U.S and Iraqi war our client Major Fadi Basem who was with the Iraqi forces and also business man made a numbered fixed deposit for 18 calendar months, with a value of Twenty Four millions Five Hundred Thousand United State Dollars only in my branch. Upon maturity several notice was sent to him,…
http://www.ebolamonkeyman.com/cheung.htm

Pick your favorite Spam Filter(s)

Antispam Filters
• An arms race
  – In some sense, “the vendors” are doing reasonably well.
• Bind the “spams” to the “sources”
  – IP addresses or source email addresses
  – Spammers control a large number of bots
  – Collaboration between yahoo.com and gmail.com

This was considered a spam!
Sometimes, the cost of a false positive may be very high…

The Implication of FPs
• Spam filters have to be conservative…
• We will have some false negatives in our own inboxes.
• We will use our own time to filter further.
  – For me, 1~2 seconds per email

You have about 1 second to decide…

“Social Spams”
• They might not be spam: we often overlook their social value!

Motivations
• What is the fundamental issue of “spams”?
  – Is it something to do with the design of our “basic communication mechanism”?
• Why can’t we explicitly utilize the “social context” in our communication?

Davis Social Links
• What is the fundamental issue of “spams”?
  – Is it something to do with the design of our “basic communication mechanism”?
• Why can’t we explicitly utilize the “social context”?
• Routable identity versus receiver control
• Trust & Reputation system in “L3”

Communicate: [A, D]
(Diagram: nodes A, B, C, D)
As long as “A” knows “D’s routable identity”

Hijackable Routable Identity

[A,D] + social context
(Diagram: nodes A, B, C, D)
“A” has to explicitly declare if there is any social context under this communication activity with “D”!

The same message content
• “M” from Cheung Pui
• “M” from Cheung Pui via IETF mailing list
• “M” from Cheung Pui via Karl Levitt

Social Context
• “M” from Cheung Pui: probably a spam
• “M” from Cheung Pui via IETF mailing list: probably not interesting
• “M” from Cheung Pui via Karl Levitt: better be more serious…
Either “M” is important, or Karl’s machine has been subverted!

[A,D] + social context ??
(Diagram: nodes A, B, C, D)
“A” has to explicitly declare if there is any social context under this communication activity with “D”!
But, “D” only cares if it is from “C” or not!

Online Social Network
• What is an online social network?
  – Realize and represent the human social networks “explicitly” (from “somewhat vague, fuzzy and implicit”)
  – Promote “OSN Applications”
  – Utilize the “online” perspective to further develop the human social network
• Representation, Application, Development

Who is Salma?

My message to Salma

The Social Path(s)

More Examples

CyrusDSL
• How do we accomplish these features?
• How do we realize the concepts scalably?
• How will this system work against spams?

Just a couple issues …
• How to establish the social route?
  – How would “A” know about “D” (or “D’s identity”)?
• How to maintain this “reputation network”?
  – MessageReaper: A Feed-back Trust Control System (Spear/Lang/Lu)

Social network analytical models
• Network Mathematics
  – Random graph model (low diameter)
    • Newman/Watts/Strogatz, 2002
  – Small world model (high cluster coefficient)
    • Watts/Strogatz, 1998
  – Scale-free network (node degree distribution)
    • Barabasi/Albert, 1999
• What is the right model for the network?

[A,D] + social context ??
(Diagram: nodes A, B, C, D)
“A” has to explicitly declare if there is any social context under this communication activity with “D”!
But, “D” only cares if it is from “C” or not!

Search on “OSN”
• How to get to (image) from (image)?
• The Small world model
  – 6 degree separation (Milgram, 1967)
  – “existence of a short path”
  – How to find the short path? (Kleinberg, 2000)

Routing in a Small World
• Common question: do short paths exist?
• Algorithmic question: assuming short paths exist, how do people find them?

Kleinberg’s Model
• Kleinberg’s model:
  – People are points on a two-dimensional grid.
  – “P” grid edges (short range).
  – “Q” long-range contacts chosen with the inverse rth-power distribution.
  – How to search?
    • [S, T]
    • Find the neighbor closest to T
  – Works well only when r=2, p=q=1

Kleinberg’s Model
• Use only local information, except the distance to the target.
  – However, what is the “global distance” in cyber space? Yet, the underlying assumption is that the “edges” depend on the “relative distance”.

X, Y, and Z
• How will we tell whether the relative distance between X&Y is closer than X&Z?
  – X, Y, Z (assuming they are all direct friends of each other)
• One simple idea: “Keyword intersection”
  – KW(X), KW(Y), KW(Z)
  – 1/(#[KW(a) ∩ KW(b)] + 1)
  – Will this work? How about global distance?

Similarity

Kleinberg’s model
• Inherently assumes “routable identity”
  – You have to know the target identity, and you also need to know the distance metric.
  – And then the search algorithm will get to it probabilistically.
  – The sender/receiver interface is very simple.
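
The “keyword intersection” idea and the “find the neighbor closest to T” search step above can be sketched together. This is only an illustrative sketch, not the DSL implementation: the function names, the example graph, and the keyword sets are all hypothetical, and the slides themselves leave open whether this metric works as a global distance.

```python
def keyword_distance(kw_a, kw_b):
    """Distance from the slide: 1 / (#[KW(a) ∩ KW(b)] + 1)."""
    return 1.0 / (len(kw_a & kw_b) + 1)


def greedy_route(graph, keywords, source, target, max_hops=10):
    """Greedy search: always step to the neighbor closest to the target.

    Assumption: every node can evaluate keyword_distance to the target,
    which is exactly the "global distance" question raised in the slides.
    Returns the path walked; it may stop early at a local minimum.
    """
    path = [source]
    current = source
    while current != target and len(path) <= max_hops:
        if not graph[current]:
            return path
        best = min(graph[current],
                   key=lambda n: keyword_distance(keywords[n], keywords[target]))
        here = keyword_distance(keywords[current], keywords[target])
        if keyword_distance(keywords[best], keywords[target]) >= here:
            return path  # no neighbor is closer: stuck
        current = best
        path.append(current)
    return path


# Hypothetical example data (not from the slides).
graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
keywords = {
    "A": {"davis", "student"},
    "B": {"davis", "food"},
    "C": {"tahoe", "ski"},
    "D": {"davis", "food", "mcdonalds"},
}
route = greedy_route(graph, keywords, "A", "D")
```

With these made-up keyword sets, B shares two keywords with D while C shares none, so the greedy walk prefers B. The sketch also shows the weakness noted in the slides: a node whose neighbors all look “farther” simply gets stuck.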
Social Route Discovery for A2D ??
(Diagram: nodes A, B, C, D)
Let’s assume A doesn’t have D’s “routable identity”
Or, “D” doesn’t have a global unique identity!
Then, how can we do A2D?

Finding ??
(Diagram: nodes A, B, C, D)
A2D, while D is McDonald’s!
D would like “customers” to find the right route.
“idea: keyword propagation” e.g., “McDonald’s”

Announcing
(Diagram: nodes A, B, C, D; label K: “McDonald’s” shown at three nodes)
Hop-by-hop keyword propagation
And, I know I am doing FLOODING!!

Now Finding
(Diagram: nodes A, B, C, D; query Q: McDonald’s; label K: “McDonald’s” shown at three nodes)
Search Keyword: “McDonald’s”
A might know D’s keyword via two channels
(1) Somebody else
(2) From its friends
Questions: does D need an identity? Scalable?

Phishing/Hijacking is the default
(Diagram: nodes A, B, C, D; query Q: McDonald’s; label K: “McDonald’s” shown at three nodes; Application Test)
Search Keyword: “McDonald’s”
Questions: is this the right Felix Wu’s?

Application Tests
• Example 1: credential-oriented
  – “PKI certificate” as the keyword
  – If you can sign or decrypt the message, you are the ONE!
• Example 2: service-oriented
  – Service/protocol/bandwidth support
• Example 3: offer-oriented
  – Please send me your coupons/promotions!
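
The announce/find steps and the application test above can be sketched as follows. This is a minimal sketch under stated assumptions, not the DSL protocol itself: `announce`, `find`, `reach`, the per-node tables, and the example graph are hypothetical names and data, and the announcement here is plain unrestricted flooding, as the “Announcing” slide admits.

```python
from collections import deque


def announce(graph, origin, keyword):
    """Flood a keyword hop by hop from its origin.

    Each node remembers the neighbor it first heard the keyword from,
    which gives a next hop back toward the origin (None at the origin).
    """
    tables = {node: {} for node in graph}
    tables[origin][keyword] = None
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if keyword not in tables[neighbor]:
                tables[neighbor][keyword] = node
                queue.append(neighbor)
    return tables


def find(tables, start, keyword):
    """Follow the stored next hops from start back to the announcer."""
    if keyword not in tables[start]:
        return None  # the keyword was never announced to this node
    path = [start]
    while tables[path[-1]][keyword] is not None:
        path.append(tables[path[-1]][keyword])
    return path


def reach(tables, start, keyword, app_test):
    """Route by keyword, then accept the endpoint only if it passes the
    application test (a stand-in for e.g. a signature check)."""
    path = find(tables, start, keyword)
    if path is not None and app_test(path[-1]):
        return path
    return None


# Hypothetical four-node graph matching the A, B, C, D diagrams.
GRAPH = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
TABLES = announce(GRAPH, "D", "McDonald's")
```

Note that nothing in the routing step proves the endpoint is the intended one; that is why the keyword lookup is paired with an application test in `reach`.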
“Routable Identity”
• Application identity =M=> Network identity
• Network identity =R=> Network identity
• Network identity =M=> Application identity

“App/Route Identity”
• Application identity =M=> Network identity
• Network identity =R=> Network identity
• Network identity =M=> Application identity
• Keywords =(MF-R)=> “Multiple Paths”
• Application identity selection
• Network route selection

Hijackable Routable Identity

Application Test ~ “Layer 3”

Finding
(Diagram: nodes A, B, C, D; query Q: McDonald’s; label K: “McDonald’s” shown at three nodes; Application Test)
Search Keyword: “McDonald’s”
Questions: is this the right Felix Wu’s? How to avoid/control flooding??

Scalability - Avoid the Flooding
• As it is, every keyword will need to be propagated to all the nodes/links (but the same keyword will be propagated through the same link only once, possibly with different policies).
• The issue: “who should receive my keywords?”

Community-Keyword Model
• A Social Peer, P, has 3 keyword sets:
  – Attributes (ATTR)
  – Original Keywords (OK)
  – Propagating Keywords (PK)

Community-Keyword Model
• Attributes (ATTR)
  – Keywords describing P (the social node)
  – Decided/configured by the owner of P
• Original Keywords (OK)
  – Keywords announced by P (the social node)
  – Decided/configured by the owner of P
  – Each keyword is associated with a propagation policy (decided by the owner of P)
• Propagating Keywords (PK)
  – From its own OK and other direct neighbors
  – Each keyword is associated with a propagation policy

in Community of Davis ??
(Diagram: nodes A, B, C, D)
Who should receive the keyword announcement for “McDonald’s”?

as the Social Peer
• Attributes:
  – {McDonald’s Express, 640 W Covell Blvd, # D, Davis, (530) 756-8886, Davis Senior High School, Community Park, North Davis}
• Original Keywords:
  – {McDonald, Davis, California, DHS, North Davis, Happy Meal, 50% off Tuesday, Lobster}
• Propagating Keywords:
  – {McDonald, Davis, California, DHS, North Davis, Happy Meal, 50% off Tuesday, Lobster, Anderson Plaza, Save-Mart, Taqueria Guadalajara}

“Per-Keyword Policy”
• For each keyword, we will associate it with a propagation policy: [T, N, A]
  – T: Trust Value Threshold
  – N: Hop counts left to propagate (-1 each step)
  – A: Community Attributes
• Examples:
  – [>0.66, 4, “Davis”] K via L1
  – [>=0, ∞, ∅] K via L2

in Community of Davis ??
(Diagram: nodes A, B, C, D)
Who should receive the keyword announcement for “McDonald’s”?

Scalability & Controllability
• McDonald’s doesn’t want to flood the whole network
  – It only wants to multicast to the “Target set” of customers
• And, it only wants this target set of users to be able to use that particular keyword to contact it.
  – Receiver/owner controllability

Autonomous Community
• Each social entity configures a set of “attributes” for itself.
• Some or all of the attributes will be exchanged with certain neighbors.

Social/Community Attributes ??
(Diagram: nodes A, B, C, D)
Who should receive the keyword announcement for “McDonald’s”?
Answer:

Relevant Attribute/OK/PK
• ATTR = Davis
• OK = McDonald’s
• PK = McDonald’s
• The owner uses the “policy” to control the flooding:
  – K = McDonald’s
  – [T > 0.66, N = 6, ATTR = “Davis”]

IP versus DSL
• IP address prefixes are announced by BGP to ALL the Autonomous Systems in the whole Internet
  – Every IP node can send packets to McDonald’s at Davis (if we have a unique IP address)
• DSL will only announce “McDonald’s” (under the control of McDonald’s Express) within the Davis social community
  – Only the receivers of the announcement can use the keyword to contact McDonald’s Express!

Community-Keyword Model
• A Social Peer, P, has three keyword sets:
  – Attributes (ATTR)
  – Original Keywords (OK)
  – Propagating Keywords (PK)
• Flooding Avoidance + Receiver/Owner Control

[T >= 0, N = ∞, ATTR = ∅] K
• What is the consequence?
  – Spam
  – Denial of Service
• How to deal with it?

[T >= 0, N = ∞, ATTR = ∅] K
• Limited Resources on PK
  – “P” can only remember up to M keywords in its own PK
• Ordering Preference between Ki and Kj
  – T(Ki) > T(Kj)
  – N(Ki) < N(Kj)
  – ATTR(Ki) ⊃ ATTR(Kj)
• Incentive Model
  – P is willing to pay a price

Potential Problems
• Mostly only local contacts
  – Local interests dominate
  – Possible resource allocation for different ATTRs within the same community

Community
• A connected graph of social nodes sharing a set of community attributes

Community ??
(Diagram: nodes A, B, C, D)

Community Control:
(Diagram: nodes C, D, E)
Who should receive the keyword announcement for “[email protected]”? Answer:
Who should receive the keyword announcement for “South Lake Tahoe Tournament”? Answer:

Community
• A connected graph of social nodes sharing a set of community attributes

Community ??
(Diagram: nodes A, B, C, D)

Social/Community Attributes ??
(Diagram: nodes A, B, C, D)
Who should receive the keyword announcement for “McDonald’s”?
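
The per-keyword policy [T, N, A] described earlier can be sketched as a filter on hop-by-hop propagation, which is one way to answer “who should receive the keyword announcement?”. Everything below is an assumption made for illustration: the slides do not say whose trust value is compared or whether the threshold is strict, so this sketch compares a per-link trust value with `>=`, requires the receiving node to hold all attributes in A, and uses a hypothetical five-node graph.

```python
import math
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    min_trust: float           # T: trust value threshold (assumed per link)
    hops_left: float           # N: hops left to propagate, -1 each step
    required_attrs: frozenset  # A: community attributes the receiver needs


def propagate(graph, trust, attrs, origin, policy):
    """Return the set of nodes that receive the keyword under the policy."""
    reached = {origin: policy.hops_left}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        hops = reached[node]
        if hops <= 0:
            continue  # hop budget exhausted at this node
        for neighbor in graph[node]:
            if neighbor in reached:
                continue
            if trust[frozenset((node, neighbor))] < policy.min_trust:
                continue  # link is not trusted enough
            if not policy.required_attrs <= attrs[neighbor]:
                continue  # neighbor is outside the target community
            reached[neighbor] = hops - 1
            queue.append(neighbor)
    return set(reached)


# Hypothetical example data (not from the slides).
GRAPH = {"D": ["B", "C"], "B": ["D", "A"], "C": ["D", "A"],
         "A": ["B", "C", "E"], "E": ["A"]}
TRUST = {frozenset(("D", "B")): 0.9, frozenset(("D", "C")): 0.9,
         frozenset(("B", "A")): 0.8, frozenset(("C", "A")): 0.9,
         frozenset(("A", "E")): 0.5}
ATTRS = {"A": {"Davis"}, "B": {"Davis"}, "C": {"Sacramento"},
         "D": {"Davis"}, "E": {"Davis"}}

davis_only = propagate(GRAPH, TRUST, ATTRS, "D",
                       Policy(0.66, 4, frozenset({"Davis"})))
flood = propagate(GRAPH, TRUST, ATTRS, "D",
                  Policy(0.0, math.inf, frozenset()))
```

In this made-up graph the restrictive policy stops at C (wrong community) and at E (weakly trusted link), while the open policy [T >= 0, N = ∞, ATTR = ∅] reaches every node, which is the flooding case the slides warn about.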
Answer: but not ALL

Community
• A connected graph of social nodes sharing a set of community attributes
• The community members can decide the administrative policy within the community
  – Membership maintenance
  – Attribute setting
  – Keyword propagation policy (e.g., allocation)
  – Application-dependent policy
  – Incentive model

Potential Problems
• Mostly only local contacts
  – Local interests dominate
  – Possible resource allocation for different ATTRs within the same community
• “Reachability”
  – How likely will my keywords be able to go through to the community I want?
  – I must be a direct friend of the community
  – How can we set up “remote long range contact”?

Community Development
• How will each one of us set up our Attributes and Original Keywords plus policy such that together we can communicate with each other “optimally”?
  – A game-theoretical setting problem for network formation

Community ??
(Diagram: nodes A, B, C, D)

Network Formation ??
(Diagram: nodes A, B, C, D)
What is B’s incentive in adding the new ATTR keyword?

Network Formation ??
(Diagram: nodes A, B, C, D)
If B adds (image), then A will add (image)!

Network Formation ??
(Diagram: nodes A, B, C, D)
Both A & C: why would A & C be willing to establish a direct friendship?

Open Issues
• What is the “value” of this social network?
• How would this “value” be distributed and allocated to each individual peer?

What is the “value” difference?
(Diagram: two graphs over nodes A, B, C, D)

“C can join (image)!”
(Diagram: two graphs over nodes A, B, C, D)

“A alone can help C to join more communities!”
(Diagram: two graphs over nodes A, B, C, D)

Value Allocation for B ?
(Diagram: two graphs over nodes A, B, C, D)

Nash Equilibrium with CS
(Diagram: nodes A, B, C, D; labels “0~30~30” and “Propagating or not?”)

Three Person Coalition Game
$\Gamma^{nf}(N, v, \mu)$, $v = 60u_{1,2} + 60u_{1,3} + 60u_{2,3} - 108u_{1,2,3}$
Player 2 gets “44”!
Again, players 1 and 3 can collaborate and break their links with 2 to get “30” each, up from merely “14”!
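
The numbers on this slide can be reproduced with a short calculation, assuming the allocation rule µ is the Myerson value (the Shapley value of the graph-restricted game); the slide does not name the rule, so that is an assumption of this sketch. The characteristic function follows from the formula above: singletons are worth 0, every pair 60, and the grand coalition 60 + 60 + 60 − 108 = 72. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def v(coalition):
    """v = 60u_{1,2} + 60u_{1,3} + 60u_{2,3} - 108u_{1,2,3}, by coalition size."""
    return {0: 0, 1: 0, 2: 60, 3: 72}[len(coalition)]


def components(nodes, links):
    """Connected components of the subgraph induced by `nodes`."""
    remaining = set(nodes)
    result = []
    while remaining:
        stack = [remaining.pop()]
        comp = set(stack)
        while stack:
            u = stack.pop()
            for a, b in links:
                for x, y in ((a, b), (b, a)):
                    if x == u and y in remaining:
                        remaining.remove(y)
                        comp.add(y)
                        stack.append(y)
        result.append(frozenset(comp))
    return result


def myerson_value(players, links):
    """Average marginal contribution in the graph-restricted game, where a
    coalition is worth the sum of v over its connected components."""
    def restricted(coalition):
        return sum(v(c) for c in components(coalition, links))

    totals = {p: Fraction(0) for p in players}
    orders = list(permutations(players))
    for order in orders:
        seen = set()
        for p in order:
            before = restricted(seen)
            seen.add(p)
            totals[p] += restricted(seen) - before
    return {p: total / len(orders) for p, total in totals.items()}


path_1_2_3 = myerson_value([1, 2, 3], [(1, 2), (2, 3)])  # 2 sits in the middle
only_1_3 = myerson_value([1, 2, 3], [(1, 3)])            # 1 and 3 drop player 2
```

Under this assumption, the path 1-2-3 gives player 2 a payoff of 44 and players 1 and 3 a payoff of 14 each, while the single link 1-3 gives players 1 and 3 a payoff of 30 each and player 2 nothing, matching the “44”, “14”, and “30” on the slide.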
(Diagram: candidate networks over players 1, 2, 3)

Value calculation
$$\sum_{S \in 2^N \setminus \{\emptyset\}} \kappa(S)\,v(S) \le v(N)$$
$$\sum_{S \in 2^N \setminus \{\emptyset\}} \kappa(S)\,v(S) = \kappa(1,2) \times 60 + \kappa(1,3) \times 48 + \kappa(2,3) \times 30 + \kappa(1,2,3) \times 72$$
$$= \frac{1 - \kappa(1,2,3)}{2} \times (60 + 48 + 30) + \kappa(1,2,3) \times 72$$
$$= \frac{1 - \kappa(1,2,3)}{2} \times 138 + \kappa(1,2,3) \times 72$$
$$= 69 + 3 \times \kappa(1,2,3) \le 72 = v(N)$$

Open Issues
• What is the “value” of this social network?
• How would this “value” be distributed and allocated to each individual peer?
• DSL, Facebook, LinkedIn didn’t define the “game” for network formation and value allocation.
  – But, it is important to design the game such that the OSN will eventually converge to a state that best supports the communities.

Social Network Games

Let’s come back to SPAM!
• How will the proposed DSL model handle spam?
• Social network games can be another major source of “social spams” that reduce the value of our online social network.

[email protected] + ??
(Diagram: nodes A, B, C, D; label K: “wu@…” + Policy)
Who should receive the keyword announcement for “[email protected]”? Answer:

Even if “A” claims ??
(Diagram: nodes A, B, C, D; label K: “wu@…”)
Who should receive the keyword announcement for “[email protected]”? Answer:

“B” can help… ??
(Diagram: nodes A, B, C, D; label K: “wu@…”)
What is B’s incentive? What is B’s risk?

(Diagram labels: Message Value & Prioritization; Link Ranks; Reputation; Incentives; Other Trust Metrics; Application IDS; [good, bad] messages)

MessageReaper
• A Feedback Control Trust/Reputation system
• Trust needs to be maintained along the route path!

Reputation on Feed-back ??
(Diagram: nodes A, B, C, D)
“D” is the one to decide whether the message from A/B/C is good or bad!

Trust Structure

Three Trust Values
• Ainit: a neighbor sending a message as the first hop.
• Afwd: a neighbor sending a message without being the first hop
• Art: a neighbor forwarding a message from me which reaches the destination

Example

1000 nodes, 20% bad

1000 nodes, 10%/40% bad

Increasing the Spammers

Orkut (15329 nodes)

Collusive Attacks
(Diagram: nodes A, B, C, D)

Robustness as OSN “Value”
(Diagram: two graphs over nodes A, B, C, D)

Community-Oriented Networking
• DSL offers a way to dynamically identify and establish social communities
  – But, we still have a lot of open issues
• Facebook:
  – Networks: email address dependent
  – Groups: you have to use your existing social network to invite.

Davis Social Links over Facebook

Smart Proxy
• Overlay Social Graph
• User-defined keywords and attributes
• DSL server
• Trust Routing Protocol
(Diagram labels: DSL, Facebook)

Sub-communities
• Social Graph
• User-defined keywords and attributes
• DSL server
• Trust Routing Protocol
(Diagram labels: DSL, Facebook)

Social Network Development
• Social Graph
• User-defined keywords and attributes
• DSL server
• Trust Routing Protocol
(Diagram labels: DSL, Facebook)

Component Interactions
(Diagram labels: Attributes; Keywords & Policies; DSL; Profiles; Social Graph, Keywords; Facebook)

Route Discovery & Messaging
(Diagram labels: Sender; Recipient; Keywords, Message; Optimal routes; DSL; Keywords, Message; Previous Interaction Outcomes, Shortest Paths; Basic Algorithm; MessageReaper)
• Identify destination nodes
• Determine optimal paths
• Remove paths that violate keyword policies
• If there is a path, store message for recipient

Antispam email/IM
(Diagram label: UCD Network)
Keyword Policy: All UCD Members get keyword ‘[email protected]’

“Bypassing” Facebook
• When you send a message…
  – Via Facebook
  – Via DSL
• Activity and Intensity hiding via Decentralization!
(Diagram labels: DSL, Facebook)

DSL vs. Google

“Google”
• It’s about the “content”
  – Data-centric networking.
• Input to the Engine
  – A set of key words characterizing the target document.
• Output
  – A set of documents/links matching the keywords

“DSL”
• It’s also about the “content”
  – The application will decide the mechanism to further the communication.
• Input to the Decentralized Engine
  – A set of key words characterizing the target document (plus the aggregation keywords).
• Output
  – A set of DSL entities with the DSP (Davis Social Path pointer) matching the keywords

DSL Search Engine
(Diagram labels: Receiver or Content; Sender or Reader; DSL Social World)
We are not just connecting the IP addresses!
We are connecting all the contents that can be interpreted!

Google vs. DSL
• Google is essentially a “routing” framework between the contents and their potential consumers.
• Google decides how to extract the “key words” from your (the owner’s) web page or document.
• A DSL “owner/receiver to be” has complete control over that. A balance between:
  – How would I like others to know about me?
    • And, I might want different folks to know me in different ways!
  – How can I differentiate myself from other Felix Wus?

DSL is an old idea!
(Diagram: nodes A, B, F)
We, as humans, have been using similar communication principles. Maybe it is a good opportunity to rethink our cyber communication system.
Identity is a per-application, context-oriented, and sometimes relative issue.
Forming cyber communities of interest for applications.

LinkedIn: Get Introduced

Another one

DSL, Facebook, AL-BGP and GENI
http://www.geni.net/DSLport
AL-BGP over GENI/PlanetLab
Each DSL/FB user should select a “closer” GENI entrance as www.geni.net. In other words, we might need to set up DNS records correctly.
(Diagram label: Facebook)

DSL Architecture
(Diagram labels: Applications with Tests; DSL; AL-BGP)

Link
(Diagram labels: Applications with Tests; 1, 2, 3, 4)

AS-oriented Social Mapping
(Diagram label: Applications with Tests)

Control versus Data Path
(Diagram labels: Applications with Tests; control path; data path; 1, 2)

Social-Control Routing
(Diagram labels: Applications with Tests; 1, 2, 3)

DSL is still an old idea!
(Diagram: nodes A, B, F)
Many applications already have a “social network like” structure to enable P2P sharing across the Internet, e.g., media sharing, online games, restaurant recommendation,…
Should we push these into a generic Social Network layer-3 to support all the applications?

A Different Internet?!
• Current Internet: every IP address will be able to communicate with every other IP address!
  – Allow by Default
• DSL-based “Internet”: we have a large number of “pairs” (two entities and their corresponding direct social link)
  – Deny by Default

The Physical Pipe
• Facebook, Overlay ~ no problem…
• Can we do better?

Comparison
• IP/email:
  – Convergence to an absolute consistent state
  – IP/email addresses are all you need, but the controllability is biased toward the sender
• DSL:
  – Convergence to a relative consistent state
  – No global network identity. Every DSL entity defines its own relative identity based on original keywords.
  – Controllability is more balanced with other application challenges.

Easy to Send & Receive
• Easy for both the good users and the spammers. (fair simplicity)
• The spammers abuse the “sending” right, while the good users have very limited options to counter.
  – How easily can we change our email address?
  – How often do we need to do that?
• A “receiver” or “the owner of the identity” should have some control.
  – But, that also means a “burden” for the users.

Davis Social Links
• Peer-to-Peer System (P2P)
  – How do humans communicate socially?
• Online Social Network (OSN)
  – How to utilize OSN to enhance communication?
  – How to have a more secure OSN?
• Autonomous Community (AC)
  – How to build/develop more effective community-based social networks?

Acknowledgement
(Diagram: nodes A, B, F)