Adaptive Stochastic Optimization

The model:
• Known distribution D
• Unknown realized input X ~ D
A solution (policy) is an adaptive sequence of actions
• Each action reveals extra information about X
Goal: the policy with the best expected objective
Here: maximization objectives
1. Stochastic knapsack
2. Stochastic matching

Stochastic Knapsack

Recall the deterministic knapsack problem:
• Jobs with reward and size; budget B
• Maximize reward s.t. total size ≤ B

Stochastic knapsack [Dean Goemans Vondrak '04]:
• Job sizes are random
  - Known, arbitrary distributions
  - Independent
  - Exact size known only after selection
• Deterministic rewards
• Maximize expected reward s.t. total size ≤ B

[Figure: a deterministic instance with rewards r_1 = 2, r_2 = 4, r_3 = 5, sizes 5, 7, 3, budget B = 9, OPT = 6; and a stochastic job with r = 1, S = 0 w.p. 1/2, S = 5 w.p. 1/2]
Example: Stochastic Knapsack

Feasible solution: select jobs sequentially. Budget B = 5.
• Job 1: r_1 = 1; S_1 = 0 w.p. 1/2, S_1 = 5 w.p. 1/2
• Job 2: r_2 = 1; S_2 = 5 w.p. 1
• Job 3: r_3 = 10; S_3 = 0 w.p. 0.1, S_3 = 6 w.p. 0.9

Adaptive policy (tree): run job 1 first.
• If S_1 = 0: run job 2 (fits exactly), then job 3 → reward 12 w.p. 0.1 (S_3 = 0); reward 2 w.p. 0.9 (S_3 = 6 overflows, no r_3)
• If S_1 = 5: run job 3 → reward 11 w.p. 0.1 (S_3 = 0); reward 1 w.p. 0.9 (no r_3)
E[Reward] = 0.5·(0.1·12 + 0.9·2) + 0.5·(0.1·11 + 0.9·1) = 2.5
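The following is a minimal sketch (my own illustration, not from the slides) that reproduces the 2.5 by enumerating the branches of the policy tree; the `run` helper and its overflow convention (an overflowing job earns nothing and ends the run) are assumptions consistent with the example above:

```python
# Expected reward of the adaptive policy above, by exact enumeration.
# Job data: reward and a discrete size distribution {size: probability}.
jobs = {
    1: (1, {0: 0.5, 5: 0.5}),
    2: (1, {5: 1.0}),
    3: (10, {0: 0.1, 6: 0.9}),
}
B = 5

def run(order, used=0.0, prob=1.0, reward=0.0):
    """Expected reward of running `order` in sequence; a job's reward
    counts only if its realized size still fits within the budget."""
    if not order:
        return prob * reward
    (r, dist), rest = jobs[order[0]], order[1:]
    total = 0.0
    for size, p in dist.items():
        if used + size <= B:        # fits: collect reward, continue
            total += run(rest, used + size, prob * p, reward + r)
        else:                       # overflow: this branch ends here
            total += prob * p * reward
    return total

# The adaptive policy branches on job 1's outcome, so expand that first
# step by hand: continue with (2, 3) if S1 = 0, else with (3,) only.
val = 0.5 * run((2, 3), used=0, reward=1) + 0.5 * run((3,), used=5, reward=1)
print(val)  # 2.5
```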
Representing Solutions

Decision tree or dynamic program:
• nodes represent residual budget and the job chosen
• branches represent random size outcomes
• a.k.a. adaptive policy
Describing a policy may take exponential space.
[Figure: a policy tree and a policy dynamic program]
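To make the dynamic-program view concrete, here is a sketch (mine, not an algorithm from the slides) that computes the optimal adaptive value for discrete size distributions by memoizing over the state (remaining jobs, budget used); the number of such states is what blows up in general:

```python
from functools import lru_cache

# Same example instance; discrete sizes keep the state space finite.
jobs = [
    (1, ((0, 0.5), (5, 0.5))),   # (reward, ((size, prob), ...))
    (1, ((5, 1.0),)),
    (10, ((0, 0.1), (6, 0.9))),
]
B = 5

@lru_cache(maxsize=None)
def opt(remaining, used):
    """Optimal expected reward given the frozenset `remaining` of
    job indices and `used` units of budget already consumed."""
    best = 0.0
    for j in remaining:
        r, dist = jobs[j]
        val = 0.0
        for size, p in dist:
            if used + size <= B:   # fits: collect r, recurse adaptively
                val += p * (r + opt(remaining - {j}, used + size))
            # else: overflow ends the process, nothing more is collected
        best = max(best, val)
    return best

print(opt(frozenset(range(3)), 0))  # 2.5: the example policy is optimal
```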
Simpler Class of Policies

Non-adaptive policy:
Add jobs in an a priori fixed order, until the budget is exhausted.
Polynomial-space representation: a permutation.
The adaptivity gap captures the loss in objective:

    Adaptivity gap = max over instances I of OPT(AD(I)) / OPT(NA(I))
Adaptivity Gap Example

Same instance as before (B = 5):
• Job 1: r_1 = 1; S_1 = 0 w.p. 1/2, S_1 = 5 w.p. 1/2
• Job 2: r_2 = 1; S_2 = 5 w.p. 1
• Job 3: r_3 = 10; S_3 = 0 w.p. 0.1, S_3 = 6 w.p. 0.9
[Figure: the adaptive policy tree with leaf rewards 12, 2, 11, 1, next to the best non-adaptive order (1, 3, 2)]
E[Adaptive] = 2.5
E[Non-adaptive] = 2.05
Ratio on this instance: 2.5 / 2.05 ≈ 1.22 (a lower bound on the adaptivity gap)
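Since the best fixed order is not obvious, here is a small brute-force sketch (my illustration; the `value` evaluator and its overflow convention are assumptions consistent with the example) that searches all 3! orders and recovers the 2.05 figure:

```python
from itertools import permutations

# Job data: reward and discrete size distribution {size: probability}.
jobs = {1: (1, {0: 0.5, 5: 0.5}),
        2: (1, {5: 1.0}),
        3: (10, {0: 0.1, 6: 0.9})}
B = 5

def value(order, used=0.0, prob=1.0, got=0.0):
    """Expected reward of the fixed `order`; an overflowing job
    ends the run and its reward is lost."""
    if not order:
        return prob * got
    (r, dist), rest = jobs[order[0]], order[1:]
    return sum(value(rest, used + s, prob * p, got + r)
               if used + s <= B else prob * p * got
               for s, p in dist.items())

best = max(permutations(jobs), key=value)
print(best, value(best))   # (1, 3, 2) 2.05, so the ratio is 2.5/2.05
```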
Why Non-Adaptive?

The problem consists of "offline" and "online" phases:
• Offline = before any job is run
• Online = while jobs are running
Non-adaptive:
• All the work is in the offline phase
• The online phase is easy/fast to implement
[Figure: offline vs. online phases of a non-adaptive (NA) and an adaptive (AD) algorithm]

Approximation Ratio

Stochastic knapsack instance I:
OPT(I) = max { E[objective under π] : π is a policy }
Contrast with the online "competitive ratio", which is measured relative to:
EOPT(I) = E_{X←D}[optimal value on input X]
E.g., n identical jobs with r = 1 and S = 0 w.p. 1/2, S = B+1 w.p. 1/2:
EOPT = n/2, but OPT ≤ 2
Approximation ratio = max over instances I of OPT(I) / ALG(I)
Here: information arrives gradually for both ALG and OPT.
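A quick simulation (my sketch, assuming the convention that an overflowing job earns nothing) makes the EOPT vs. OPT separation on this instance concrete:

```python
import random

# n identical jobs: reward 1; size 0 w.p. 1/2, size B+1 w.p. 1/2.
# The offline optimum sees the realization and keeps every zero-size
# job; any policy commits blindly, so its first size-(B+1) pick
# overflows the budget and ends the run.
n, trials = 40, 100_000
eopt = policy = 0.0
for _ in range(trials):
    fits = [random.random() < 0.5 for _ in range(n)]
    eopt += sum(fits)            # offline optimum: all zero-size jobs
    for ok in fits:              # all orders are equivalent by symmetry
        if not ok:
            break
        policy += 1
print(eopt / trials, policy / trials)   # ≈ n/2 = 20 vs ≈ 1 (well below 2)
```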
Outline

1. Stochastic knapsack (SK) basics
2. Non-adaptive algorithm for SK
3. Correlated stochastic knapsack (CSK): non-adaptive algorithm
4. Adaptive algorithm for CSK
5. Extensions

Some Definitions

• Jobs [n] := {1, 2, …, n}
• r_i = reward of job i
• S_i = size of job i (a random variable)
• Budget B = 1 (by scaling)
• Capped mean size e_i = E[min{S_i, 1}]
• Effective reward w_i = r_i · Pr[S_i ≤ 1]
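These two quantities are easy to compute for discrete size distributions; a minimal sketch (the example numbers are mine, purely illustrative):

```python
# Capped mean size and effective reward after scaling the budget to B = 1.
def capped_mean(dist):          # e_i = E[min{S_i, 1}]
    return sum(p * min(s, 1.0) for s, p in dist.items())

def effective_reward(r, dist):  # w_i = r_i * Pr[S_i <= 1]
    return r * sum(p for s, p in dist.items() if s <= 1.0)

dist = {0.0: 0.5, 1.2: 0.5}     # size 0 w.p. 1/2, size 1.2 w.p. 1/2
print(capped_mean(dist))        # 0.5*0 + 0.5*1 = 0.5
print(effective_reward(2.0, dist))  # 2 * Pr[S <= 1] = 1.0
```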
LP Relaxation

max  ∑_{i=1}^n w_i · x_i
s.t. ∑_{i=1}^n e_i · x_i ≤ 2
     0 ≤ x_i ≤ 1,   i = 1, 2, …, n

Theorem: LP* ≥ OPT
• T_i := indicator that job i is chosen by OPT (it may not fit)
• Consider the LP solution x_i = Pr[T_i = 1]
Claim: ∑_{i=1}^n min{S_i, 1} · T_i ≤ 2
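This LP is tiny to solve in practice; a sketch using scipy.optimize.linprog (my choice of solver, with made-up w_i, e_i values):

```python
import numpy as np
from scipy.optimize import linprog

# max sum w_i x_i  s.t.  sum e_i x_i <= 2, 0 <= x_i <= 1.
# linprog minimizes, so negate the objective.
w = np.array([1.0, 1.0, 5.0])   # effective rewards (illustrative)
e = np.array([0.8, 0.9, 0.9])   # capped mean sizes (illustrative)

res = linprog(c=-w, A_ub=[e], b_ub=[2.0], bounds=[(0, 1)] * len(w))
print(res.x, -res.fun)          # fractional optimum: LP* >= OPT
```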
Why the claim holds: on each decision path, at most one chosen job overflows, and it ends the run. The jobs before it fit, so their total size is ≤ 1 = B, and the overflowing job itself contributes min{S_i, 1} ≤ 1.
⇒ x ∈ LP, since S_i and T_i are independent (T_i depends only on the sizes of earlier jobs), so E[min{S_i, 1} · T_i] = e_i · x_i.

LP relaxation: formal proof
• A_t := set of first t jobs chosen by OPT (t = 0, 1, …, n); A_0 = ∅, A_n = all jobs chosen by OPT. Note T_i = 1 iff i ∈ A_n.
• Y_t := ∑_{i∈A_t} (min{S_i, 1} − e_i)
• Conditioned on Y_t and the next job j:
  E[Y_{t+1} | Y_t, j] = Y_t + E[min{S_j, 1}] − e_j = Y_t
• So Y_0, …, Y_n is a martingale, and E[Y_n] = E[Y_0] = 0
• i.e. E[∑_{i∈A_n} e_i] = E[∑_{i∈A_n} min{S_i, 1}] ≤ 2, by the claim
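The martingale identity E[Y_n] = 0 can be checked numerically; here is a sketch of mine that runs an arbitrary fixed-order policy (the argument holds for any policy, since each choice depends only on sizes revealed so far) and averages Y_n:

```python
import random

# Sizes scaled so B = 1. Each job: (reward, [(size, prob), ...]).
jobs = [(1.0, [(0.0, 0.5), (0.9, 0.5)]),
        (2.0, [(0.4, 0.7), (1.5, 0.3)]),
        (1.5, [(0.2, 1.0)])]

def capped_mean(dist):
    return sum(p * min(s, 1.0) for s, p in dist)

def draw(dist):
    u, acc = random.random(), 0.0
    for s, p in dist:
        acc += p
        if u < acc:
            return s
    return dist[-1][0]

trials, y_sum = 200_000, 0.0
for _ in range(trials):
    used, y = 0.0, 0.0
    for _, dist in jobs:       # fixed-order policy; the increment of a
        s = draw(dist)         # chosen job has mean zero by definition
        y += min(s, 1.0) - capped_mean(dist)
        used += s
        if used > 1.0:         # overflow ends the run
            break
    y_sum += y
print(y_sum / trials)          # ≈ 0, i.e. E[Y_n] = E[Y_0] = 0
```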