您的当前位置:首页正文

Scheduling with QoS in parallel IO systems

2022-02-22 来源:爱够旅游网
SchedulingwithQoSinParallelI/Osystems

AjayGulati

DepartmentofComputer

ScienceRiceUniversityemail:gulati@rice.edu

Abstract—ParallelI/Oarchitecturesareincreasinglydeployedforhighperformancecomputingandinshareddatacenters.IntheseenvironmentsitisdesirabletoprovideQoS-basedallocationofdiskbandwidthtodifferentapplicationssharingtheI/Osystem.Inthispaper,weintroduceamodelofdiskbandwidthallocation,andprovideefficientschedul-ingalgorithmstoassignthebandwidthamongtheconcurrentapplications.

I.Introduction

ParallelI/Oarchitecturesconsistingofmulti-pledisksconnectedwithhighbandwidthin-terconnectarethenorminhigh-performancedatacentersandsupercomputinginstallations.Dataisdistributedamongthediskstoen-ablesimultaneousparallelaccess.IndividualapplicationscanpotentiallyspeeduptheirI/Obyfetchingdatablocksinparallelfrommultipledisks,andbufferingtheminmemoryuntilrequired.Thebenefitofprefetchingde-pendonseveralfactorssuchasthetemporaldistributionofdiskaccesses,theamountofbuffermemoryavailabletosmoothoutun-evendistributionsofrequests,andtheamountoflookaheadavailable.Asubstantialbodyofworkexistsontheproblemofprefetchingfrommultipledisks,dealingbothwithissuesofaccesspredictionandschedulingatthesystemlevel[15],[16],[20],[22],aswellasschedulingalgorithmsandtheiranalyses[1],[11],[13],[14].Manyoftheseissuesarefairlywellunderstoodatthistime.

Whenseveralconcurrentapplicationsaresi-multaneouslysharingtheI/Osystem,theschedulingoftheI/Osbecomesmorecompli-cated.Twoissuesofconcerninthissituationaremaximizingutilizationofthediskband-

PeterVarmanDepartmentofECE&

andCSRiceUniversityemail:pjv@rice.edu

widthbymultiplexingitamongtheconcur-renttasks,andavoidingstarvationofindivid-ualtasks.Evenwhenconsideredseparatelytheproblemsarechallenging.Forinstance,incontrasttothecaseofasingleaccesssequenceforwhichefficientschedulingalgo-rithmshavebeenfound[1],[11],[13]–[15],theproblemofmaximizingthediskband-widthutilizationwhileschedulingasetofn,n>1,referencestringscanbeshownNP-complete[9].Besidesbeingcomputationallyintractableandrequiringaprioriknowledgeoftheentiresetofaccesssequences,suchaminimal-lengthschedulemaybeunattractivefromtheviewpointoffairnessintheallo-cationofthediskbandwidthtoindividualapplications.

InthisextendedabstractwepresentamodelforfairdiskbandwidthallocationamongconcurrenttasksinaparallelI/Osystem,andpresentefficientschedulingalgorithmsforimplementingseveralallocationpolicies.Bynecessity,ourmodelisanidealizedab-stractionofacomplexparallelI/Osystemwheremechanicaldiskdelaysandintercon-nectschedulingissuesneedtobeconsidered.Manyofthesefactorscanbecapturedbyamoredetailedmodel;thisworkwillcon-centrateonthehigher-levelschedulingissues.Moredetailswillbeprovidedinthecompletepaper.

Therestofthepaperisorganizedasfollows.InSectionIIwedescribethemodelandtheattendantdefinitions.InSectionIIIandIVwepresentouralgorithmsforschedulingwithfairnessandweighted-QoS.SectionVcontainssomepreliminarysimulationresults.

WereviewrelatedworkinsectionVIandendwithsomeconclusionsinsectionVII.

II.Overview

AparallelI/OsystemconsistsofasetofdindependentdisksD={D1,D2,···,DDataisstoredonthedisksinunitsofblocksd}.;ablockistheunitofaccessfromadisk.IneachparallelI/Ooperationasetofuptodblocks,onefromeachdisk,canbeaccessed.Theblocksfetchedfromthedisksmaybebufferedinaninternalmemorybufferuntiltheyarerequired.PrefetchingcangreatlyreducetheoverallI/Olatency.

AsetofnindependentapplicationsortasksisassumedtobeconcurrentlyaccessingtheI/Osystem.EachtaskisabstractedbyareferencestringRi,1≤i≤n;Riistheorderedsequenceofblocksthatisrequiredbythatapplication.ForeachreferencestringthesystemhasalookaheadofL−1blocks,L≥1,beyondthecurrentrequestfromthatstring.Thislookaheadenablestheapplicationtoprefetchblocksthatwillbeaccessedinthenearfutureandholdtheminthebufferuntilrequired.AnI/OscheduleconsistsofasequenceofparallelI/Osteps:ineachstepatmostoneblockfromeachoftheddisksisaccessed.AnI/OscheduleissaidtobeworkconservingifeveryI/Ostepemploysmaximalparallelism:thatis,adiskisbusyunlessthereisnorequestforthatdiskorthereisnotenoughbufferspacetoholdtheaccessedblock.InthispaperwewillassumethatthebufferislargeenoughtoholdthelookaheadLofallthestrings:henceinaworkconservingschedule,theonlyidledisksarethosewithnorequestsatthatstep.Underthisassumptionoflimitedlookahead,ithasbeenshownthataworkconservingscheduleminimizesthenumberofI/Ostepsrequiredforanysinglereferencestringinisolation[13].

AparallelI/Ostepwillberepresentedbya

fetchvector,F=[b1,b2,···,bn],wherebiisthenumberofblocksfromreferencestringRithatarefetchedinthatI/Ostep.NotethatsinceatmostdblockscanbefetchedinanyparallelI/Ostep,∑1≤i≤nbi≤d.Thecumu-lativefetchvector,CF=[BB1,B2,···,Bn],whereiisthetotalnumberofblocksfromreferencestringRifetchedsofar.NotethatCFcanbeobtainedbythecomponent-wiseadditionofthefetchvectorsforeachoftheprecedingI/Osteps.Forweightedal-locationofbandwidth,weletwiindicatetherelativepriorityassignedtoareferencestringRi.TheproportionatevectorVp=[v1,v2,···,vn]wherevi=Bi/wi.Finally,werefertothesumofthecomponentsofavectorasitsweight.

WewillbeinterestedinconstructingworkconservingschedulessincesuchscheduleswillmaximizethediskutilizationateveryI/Ostep.Inaddition,wewouldlikethescheduleateachsteptobefair:byfairnessweintuitivelymeanthatwetrytofetchasevenlyaspossibleforallthereferencestrings.Formally,wesaythatthescheduleatastepisfairifthefetchvectorFatthatstepislex-icographicallysmallestamongalln-vectorswiththesameweight.Weformallystatethedefinitionoflexicographicminimumbelow:ConsidertwovectorsF=[f1,f2,···fn]andG=[g1,g2,···,gn],suchthat∑f,ifi=∑foralligi,andthati≥fi+1andgi≥gi+11≤i≤n−1.ThenFislexicographicallysmallerthanGifandonlyifthereissomeindexk,k≥1,suchthatfi=gi,1≤i≤k−1andfgfollowingkWedefinethreedifferentpoliciesforschedul-ing.

1)LocallyFairAllocation:AchieveafairscheduleateachI/Ostepthatisworkconserving.

2)GloballyFairAllocation:Achieveafairwork-conservingschedulebasedonthecumulativenumbersofblocksac-cessedbyeachstring.

3)WeightedAllocation:Achieveaworkconservingscheduleinwhichthecumu-lativenumberofblocksaccessedafteranystepareinproportiontotherelativeprioritiesofthereferencestrings.

A.OverviewoftheAlgorithms

Atanytime,thereareasetofcandi-dateblocksfromeachreferencestringthatcanbefetchedinthisI/Ostep.Thesearetheunfetchedblocksinthelookaheadforthatstring.Thesystemismodeledbyan(V󰀁augmented{s,t},E󰀁bipartiteEt󰀁

resourcegraphG=Es)definedasfollows:

V={D1,D2,···,Dd}󰀁

{R1,R2,···,Rn}isasetofn+dnodes,oneforeachreferencestringanddiskinthesystem.•

Eisthesetofdirectededgesbetweennodesrepresentingreferencestringsandnodesrepresentingdisks:thereisanedge(Ri,Dj)wheneverthereisarequesttodiskDjinthecurrentlookaheadwindowofreferencestringRi.

Distinguishedverticessandtwillserveasthesourceandsinkofpathsthroughthegraph.

•Et={(Dj,t),1≤j≤d},isthesetofdirectededgesfromeachdisknodetot.•

Es⊆{(s,Ri),1≤i≤n},isasubsetofthedirectededgesfromstostringnodesRi;thissubsetchangesdynamicallyasthealgorithmproceeds.

Wewillshowthattheschedulingproblemsdefinedabovecanbemappedtofindingasetofpathsinadynamicallyevolvingresource

graph.InitiallytheresourcegraphconsistsofGdefinedabovewithEsbeingempty.Ouralgorithmswillmaintainapriorityvector,P=[p1,p2,···,pn]:piisthepriorityforstringnodeRi.Ateachstepthenodewithhighest-priorityisselected,andanedgefromstothisnodeisaddedtoEs.Thealgorithmthenattemptstofindapathinthecurrentresourcegraphfromstotusingtheedgebetweensandtheselectednode.

Ifapathcannotbefoundthroughthecur-rentlyselectednodeRi,thenthenodeismarkedassaturated,andthealgorithmwilladjustthepriorityofRisothatitwillnotbeselectedagain.Asaturatednodemeansthatitisnotpossibletoincreasethetotalnumberofscheduleddisksbymakingafurtherassign-menttothisreferencestring(i.e.theweightofthefetchvectorwillnotincrease).Anassignmenttoasaturatednodewillcomeattheexpenseofreducingtheassignmenttooneofthepreviouslyschedulednodes.Thechoiceofprioritiesissuchthatthisreallocationisundesirable.

Ifthesearchforapathissuccessfulthismeansthattheweightofthefetchvectorcanbeincreasedbyassigninganadditionaldisktotheithstring.Thepreviousassignmentofdiskstostringsmightgetchangedtomakethispossible,butthenumberofdisksassignedtoanyotherstringwillnotchangebythisreassignment.Thustheweightofthefetchvectorisincreasedwithoutdisturbingtherelativeallocationsneededforfairness.Theresourcegraphismodifiedtoreflectthenewassignmentsasdescribedbelow.Thisstepwillbereferredtoaspathconditioninginthedescriptionofthealgorithms.

PathConditioning:Inthecurrentresourcegrapheveryedge(Ri,Dj)inthediscoveredpathisreplacedbyitsreverseedge(Dj,Ri),andsimilarlyeveryedge(Ds,Rt)inthepathisreplacedbythereverseedge(Rt,Ds).Thelastedgeonthediscoveredpathfromsome

Dutotisalsoreplacedbythereverseedge(t,Du).

Theinvariantmaintainedbythealgorithmisthatthepresenceofanedge(Ds,Rt)intheresourcegraphatthestartofaniterationmeansthatcurrentlydiskDsisassignedtostringRt.SupposethepathgoingthroughselectednodeRuatthisiterationis(s,Ru,Dj1,Rj1,Dj2,Rj2,···,Djimpliesthatatthestartk,Rjofthek,Djiterationk+1,t).Thistherewerekassignments:diskDjtostringRWheniwasassignedji,1≤i≤k.theedgesbetweenstringanddisknodesarereversed,thek+1newassignmentswillbe:Dj1toRu,DjD2toRj1,andsoonendingwithjk+1assignedtoRjk.NotethatforeachofthestringverticesRjionlytheidentityoftheassigneddiskwaschanged.Notethatalthoughrelatedtotheproblemsofbipartitegraphmatchinganddeterminingmaximumflow[7],therearesubtledifferencesthatprecludedirectapplicationofeitherofthesealgorithmstoourproblem[9].OuralgorithmisinspiredbytheideasofaugmentingpathsandresidualgraphsemployedintheFordFulkersonalgorithmformaximumflowinanetwork,buthasbeensuitablyrefinedfortheproblemathand.

III.Localfairness

Problemdefinition:Findanassignmentofdiskstoreferencestringssuchthatthefetchvector,F=[b1,b2,...,bn]hasmaximalweightandislexicographicallyminimum.

IntheresourcegraphGdefinedearlierletD

ˆbethenumberofdisknodesthathaveatleastoneincidentedge.Thenthemaximalweight

ofthefetchvectorisD

ˆ,i.e.awork-conservingschedulewillassignD

ˆdisks.A.LFSalgorithm

AstraightforwardalgorithmtoobtainlocalfairnesscantryallassignmentsorderedpartitionsofD

ˆmadeupof

intoncomponents,inincreasinglexicographicorderuntilafeasiblescheduleisfound.However,therunningtimeofsuchanalgorithmwouldbethereareθ(D

ˆnexponentialas

)possiblepartitionsofDˆintonparts,wheretheorderofpartsisimportant.ThealgorithmLFSshowninfigure1givesthemaximalweight,lexicographicallymin-imumfetchvectorinO((n+D)|E|)time.Thealgorithmhasthestructuredescribedinoverview.Thepriorityvectorbeginswithallcomponents0;ateveryiterationeithertheweightofthepriorityvectorincreasesby1,oronecomponent(stringnode)issaturated.The(nonsaturated)componentwithsmallestvaluewillhavethehighestpriorityinthevectorP.Theuseofthedynamicgraphcreatedbythepathconditioningstepensuresthattheassignmentsofdisksatsomestagedonotprecludereassignmentatalaterstage.ThisobservationandthepruningofsaturatednodesallowsthenomorethanD

ˆalgorithmtoterminatein

+niterations;eachiterationrequiresanO(|E|)pathsearchingalgorithm.WepresentsomelemmasrelatedtoLFSal-gorithm(theproofsareomittedduetolackofspace).

Lemma1:Onceadisknodehasbeenas-signedtosomestringnodeitwillneverbecomefree(althoughitmaybereassignedtoadifferentstringnode).

Lemma2:LFSfindsthecographicalallocationofD

ˆminimumlexi-diskblockstoreferencestrings.

Lemma3:TherunningtimeofLFSalgorithmisO((n+D)|E|).

IV.Global

FairnessandQoS

Scheduling

Thelocalfairnessmetricaimstodistribute

diskaccessesfairlyateveryI/Ostep.Therearesituationswheresuchanapproachmaybeinadequate.Ifanapplicationhasaburstofrequeststojustafewdisks,itwillreceivea

(1)(2)(3)(4)(5)(6)

(7)(8)(9)(10)(11)(12)

Fig.1.

LFS(LocallyFairScheduling)algorithm:

ofblocksaccessed,sothatstringswhichhave

weight=0

thelargestslack(orminimumBi)atanystepGisaugmentedresourcegraph,withEs=φaregivenachancetocatchup(instep6).TheLetpriorityvectorP=[0,0,...,0].

Markallelementsofthevectorasnon-saturated.restofthealgorithmremainsthesame.The

ˆwhile(weightChoosethelowest-valuednon-saturated

elementpiofPwithtiesbrokenarbitrarily.Addanedge(s,Ri)

A.WeightedallocationFindanypathfromstotthatincludesthe

edge(s,Ri).

if(nopathisfound)

ForQoS,wewanttoassignunequalpro-MarkpiassaturatedinP.

portionsofresourcestothetasksneedingelse

them.Westudytheproblemofassigningdiskweight=weight+1;pi=pi+1

UpdategraphGbypathconditioningbandwidthinaskewedmannertomultiple

O((n+D)|E|)algorithmforlocalfairscheduling

smallerfractionofthebandwidth;itmaybe

desirabletofavorthesestringsoncethehotspotactivitypasses.Inothercases,asystem-aticalbeitsmalldifferenceinlocalallocationateachstep,mayresultinasignificantspreadofthecumulativebandwidthallocationforlongrunningapplications.Inordertohandlesuchsituationswedefineaglobalfairnesscriterion,anddescribehowwecanadaptourbasicalgorithmforthispurpose.Intuitivelytheglobally-sensitivealgorithmcalledGFS,keepsthehistoryofaccessesinthecumu-lativefetchvectorCF.InthecurrentI/Oassignment,GFSwilltryandequalizethecomponentsofCF,byfavoringstringswhicharelagginginthetotalnumberofblocksfetchedsofar.

TheoutlineofthealgorithmisquitesimilartotheLFSalgorithminFigure1.AsinLFS,weighttracksthenumberofdisksallocatedatanystepofthealgorithm.WeletCF=[B1,B2,···,Bn]betheCFvectoratthestartoftheI/Ostep.InitializethepriorityvectorP=[p1,p2,···,pn],wherepi=Bi(instep4):pitracksthetotalnumberofblocksthathavebeenfetchedfortheithstring.Inthemainloop,stringsarescheduledinincreasingorder

referencestrings.Lettheproportionatevec-torVp=[v1,v2,···,vn],wherevi=Bi/wiforithstring.Theproblemcanbestatedmoreformallyasfollows:

Problemdefinition:Findanassignmentofdiskstoreferencestringsthatmaximizestheweightofthefetchvectorandlexicographi-callyminimizestheproportionatevectorVp=[v1,v2,···,vn].

ThealgorithmforweightedallocationcalledWeAB,keepsthehistoryofaccessesinthecumulativefetchvectorCF.InitializethepriorityvectorP=[p1,p2,···,pn]aspi=Bi/wi.Intuitivelypiindicatetheproportionatebandwidthgiventotheithstringsofar.InthecurrentI/Ostep,WeABwillassigndiskstotryandequalizethecomponentsofP,byfa-voringstringswhicharelaggingintheirpro-portionofblocksfetched.NotethatassigningadisktoRiincreasespiby1/wiinsteadof1.Inthemainloop,stringsarescheduledinde-creasingorderofproportionateslack,sothatstringswhichhavethelargestproportionateslackatanysteparegiventhefirstchancetodecreasetheirslack.Mathematically,thestringwiththeminimumvalueof(Bi+1)/wihasthehighestpriority.Ateachstepeitherastringnodegetssaturatedortheweight

ˆdisksareincreasesby1.ItcontinuesuntilD

scheduled.Amoreformaldescriptionofthealgorithmisgiveninfigure2.TheWeABalgorithmalsorunsinO((n+D)|E|)time.

(1)(2)(3)(4)(5)(6)

(7)(8)(9)(10)(11)(12)

Fig.2.

WeABalgorithm:weight=0

GisaugmentedresourcegraphwithEs=φLetpriorityvectorP=[p1,p2,···,pn],wherepi=Bi/wi.Markallpi’sasnon-saturated

ˆwhile(weightChoosethenon-saturatedelementpiof

Psuchthat(Bi+1)/wiisminimumwithtiesbrokenarbitrarily.Addanedge(s,Ri)

Findanypathfromstotthatincludestheedge(s,Ri).

if(nopathisfound)

MarkpiassaturatedinP.else

weight=weight+1;Bi=Bi+1;pi=Bi/wiUpdategraphGbypathconditioning

O((n+D)|E|)algorithmforweightedallocation

Bandwidth vs Number of IOs, D=64, N=4

0.4

Reference string 1Reference string 2Reference string 3Reference string 4

0.35

Bandwidth allocated 0.3

0.25

0.2

0.15

0.1

0 50 100 150 200IO number

250 300 350 400

Fig.3.GFS:fairallocationfor4strings

Bandwidth vs Number of IOs, D=64, N=3

0.5

Reference string 1(0.5)Reference string 2(0.3)Reference string 3(0.2)

0.45

V.Results

Weevaluatethealgorithmsforglobalandproportionalallocationbysimulations.Forglobalfairness,weperformedanexperimentwith4referencestrings(generatedrandomly)and64disks.Theamountoflookaheadisequaltonumberofdisksforeachreferencestring.Figure3showstheactualdistributionofbandwidthamongthestrings.Thefigureconfirmsthatthebandwidthisequallydis-tributedamongallfourstrings.Initialfluctu-ationsarerelatedtothedistributionofdiskrequestsinthereferencestringsandtheworkconservingnatureofthealgorithm,thatpre-ventsanydiskfromidlingifsomerequestforthatdiskcanbescheduled.Forweightedallocation,weperformedexperimentswith3randomly-generatedreferencestringsand64disks.StringsR1,R2andR3havebeenassignedrelativeprioritiesof0.5,0.3and0.2respec-tively.Figure4showstheactualbandwidthallocatedtothestringsusingWeABalgo-rithm.TheresultshowninthefigureindicatesthatWeABalgorithmallocatesdifferentiatedbandwidthtovariousstringsaccordingtotheweightsassignedtothem.

Wealsoexperimentedwithskewedinput

Bandwidth allocated 0.4

0.35

0.3

0.25

0.2

0.15

0 20 40 60 80

100IO number

120 140 160 180 200

Fig.4.WeABwithweights0.5,0.3,0.2

models.Wearecurrentlyextendingtheevalu-ationtoincorporatephysicaldiskparametersandvariableaccesstimes.Amorecompre-hensiveevaluationofthealgorithmswillbepresentedinthefullpaper.Wealsoex-perimentedwithskewedinputmodelsandwewillbepresentingamorecomprehensiveevaluationofalgorithmsinthefullpaper.

VI.Relatedwork

Severaldiskschedulingpolicieshavebeenproposedforsingleaswellasmultipledisks.Haritsaetal.[10]havepresentedefficientandfairschedulingalgorithmsforasingledisk.Theytrytoachievefairnessatthecylinder

levelbyproposingstrategiesfordiskheadmovementafterarequesthasbeenserviced.TheYFQ[4]algorithmwhichisanextensiontogeneralizedprocessorsharing(GPS)modelofproportionalresourceallocation,allowsapplicationstoreserveafixedshareofdiskbandwidth.Cello[21]providesaschedulingframeworkforaheterogeneousmixofappli-cationsaccessingthedisk.Itemploysa2-levelschedulingframework,whererequestsfromapplicationsareputintodifferentqueuesbasedontheirneedsatthefirstlevelandacommonschedulerreorderstherequestsselectedfromthesequeuestominimizeseekandrotationaldelaysatthesecondlevel.Ourapproachhowever,providesbandwidthallocationacrossmultipledisks.Bybatchingrequeststhatbelongtoasinglediskmanyofthelowerleveloptimizationssuggestedintheseworkscanbeemployedalongwithourhigherlevelframework.

Lumbetal.presentedFacade[6]thatisanapproachtofulfillServiceLevelObjective(SLO)ofindependentworkloadsaccessingastoragesystem.Facadeprovidesadynamictrade-offbetweentheIOrateandaveragelatencybyreal-timeschedulingandfeedbackbasedcontrolofdiskqueuelengths.Theirap-proachassumestheavailabilityofacapacityplannerforadmissioncontrol.OurapproachprovidesafinercontroloverallocationofdiskstoworkloadsineachI/Otoachievethedesiredbandwidthallocation.Interposedproportionalsharingalgorithmsaresuggestedin[12]thatassignvirtualstartandfinishtimestorequestsbasedontheircost,clientqueueandarrivaltime.In[12],theserverisconsideredasasingleresourcesharedbyalltheclients,whereasweattainfairnessformultipleinputstreamsrequestingmultipleresources(disks).

ManyschemeshavebeendevelopedforfairschedulinginInternetrouters.SchemessuchasiSLIP[17],shakeup[8]trytoachieve

highthroughputandfairnessbylookingformaximalbipartitematchingbetweeninputsandoutputsateachpointofscheduling.iFS[18]providesfairnessandQoSguaranteesbyassigningvirtualstartandfinishtimestoeachpacket.Thesealgorithmstrytoobtainan1-to-1mappingbetweeninputandoutputportsinsteadof1-to-manymappingrequiredforourproblem.Iterativelyapplyingbipartitematchingdoesnotguaranteeafairscheduleinourcasebecauseitgivesanirrevocablemappingthatcannotbechangedinfutureiterations.Manyotherarchitectureshavebeenproposedfordifferentiatedallocationofband-widthtoflowssuchasdiffservs[19]andQoSBox[5].NoneofthemcanbeusedforQoSinautomatedstoragesystems[23].Asstatedin[23],performanceofstoragesystemsdependsalotonthecurrentstateandstorageprotocolsarenotamenabletopacketdroppingandtrafficshaping.

AutomatedtoolssuchasHippodrome[3],Minerva[2]canbeusedtodesignstoragesystems,ifthecapacityrequirementsandworkloadcharacteristicsareknownapriori.Thesetoolsgothroughtheiterativeprocessofdesigningandevaluatingthesystem.Oncetheworkloadsoranyotherrequirementchangesthewholeprocessneedstoberepeatedagain.Ourapproachcanhandleshorttermfluctua-tionsandprovideweighted-QoSeveninthepresenceofunpredictableworkloads.

VII.Conclusions

InthispaperweexploredschedulingschemestoprovidebandwidthallocationinparallelI/Osystems.LFSalgorithmobtainslocalfairnessateachI/Ostep.GFSalgorithmprovidesfairallocationoverlargertimewindowsbykeepingahistoryofblocksaccessedbyeachstring,andWeABisageneralschemetoobtainanI/Oschedulewithprioritizedallo-cationofdiskbandwidthtovariousstrings.WeABcansupportaneconomicmodelof

servicesprovidedbyadatacenterthatneedtoprovidedifferentiatedbandwidthtovar-iouscustomers.Wedemonstratedviasim-ulationsthattheWeABalgorithmmatchesthebandwidthallocatedtovariousapplicationtotheirdesiredpriotiryandGFSobtainsafairschedule.Allthesealgorithmsareworkconservingandprovidehighthroughput.Weplantoimplementouralgorithmsinanactualsystemtogetabetterideaaboutperformanceandscalabilityofourapproachinfuture.

Acknowledgments

SupportforthefirstauthorwasprovidedbytheNationalScienceFoundationunderGrantCCR-0105565.ResearchtimeduringIPAas-signmentofthesecondauthorwasprovidedbytheNationalScienceFoundationundertheIRDprogram.

References

[1]S.Albers,N.Garg,andS.Leonardi.Minimizingstall

timeinsingleandparalleldisksystems.J.ACM,47(6):969–986,2000.

[2]G.A.Alvarezandetal.Minerva:anautomated

resourceprovisioningtoolforlarge-scalestoragesys-tems.InACMTransactionsonComputerSystems,pages483–518,November2001.

[3]E.Andersonandetal.Hippodrome:runningcircles

aroundstorageadministration.InFileandStorageTechnology(FAST’02),pages175–188,January2002.[4]J.Bruno,J.Brustoloni,E.Gabber,B.Ozden,and

A.Silberschatz.Diskschedulingwithqualityofserviceguarantees.InProceedingsoftheIEEEInternationalConferenceonMultimediaComputingandSystemsVolumeII-Volume2,page400.IEEEComputerSociety,1999.

[5]N.ChristinandJ.Liebeherr.TheQoSbox:APC-RouterforQuantitativeServiceDifferentiationinIPNetworks.Technicalreport,2001.

[6]A.M.ChristopherLumbandG.Alvarez.Facade:

Virtualstoragedeviceswithperformanceguarantees.FileandStoragetechnologies(FAST’03),pages131–144,March2003.

[7]T.H.Cormen,C.E.Leiserson,andR.L.Rivest.

IntroductiontoAlgorithms.TheMITPress,pages579–598,1999.

[8]M.W.Goudreau,S.G.Kolliopoulos,andS.B.

Rao.Schedulingalgorithmsforinput-queuedswitches:randomizedtechniquesandexperimentaleval-uation.IEEEINFOCOM,3:1634–1643,March2000.

[9]A.Gulati.SchedulingwithQoSinparallelI/Osystems.

Master’sthesisinprogress,RiceUniversity,June2004.[10]J.R.HaritsaandT.Pradhan.Fairdiskschedulers

forhighperformancecomputingsystems.Proc.ofInetrnationalConf.onHighPerformanceComputing,December1995.

[11]D.A.Hutchinson,P.Sanders,andJ.S.Vitter.Duality

betweenprefetchingandqueuedwritingwithparalleldisks.InProceedingsofthe9thAnnualEuropeanSym-posiumonAlgorithms,pages62–73.Springer-Verlag,2001.

[12]W.Jin,J.S.Chase,andJ.Kaur.Interposedproportional

sharingforastorageserviceutility.SIGMETRICSPerform.Eval.Rev.,32(1):37–48,2004.

[13]M.KallahallaandP.Varman.Optimalread-once

paralleldiskscheduling.Proc.6thACMWorkshoponI/OinParallelandDistributedSystems(IOPADS’99),April1999.

[14]M.KallahallaandP.J.Varman.Optimalprefetching

andcachingforparallelI/Osystems.In13thACMSymposiumonParallelAlgorithmsandArchitectures,pages219–228,July2001.

[15]T.Kimbrel,P.Cao,E.W.Felten,A.R.Karlin,and

K.Li.Integratedparallelprefetchingandcaching.InProceedingsofthe1996ACMSIGMETRICSinter-nationalconferenceonMeasurementandmodelingofcomputersystems,pages262–263.ACMPress,1996.[16]D.KotzandR.Jain.I/Oinparallelanddistributed

systems.InA.KentandJ.G.Williams,editors,EncyclopediaofComputerScienceandTechnology,volume40,pages141–154.MarcelDekker,Inc.,1999.[17]N.McKeown.TheiSLIPSchedulingAlgorithmfor

Input-QueuedSwitches.IEEE/ACMTransactionsOnNetworking,7:188–201,April1999.

[18]N.NiandL.N.Bhuyan.Fairschedulingininternet

routers.IEEETransactionsOnComputers,pages686–701,June2002.

[19]K.Nichols,S.Blake,F.Baker,andD.Black.Anar-chitecturefordifferentiatedservices.Technicalreport,December1998.

[20]R.H.Patterson,G.A.Gibson,E.Ginting,D.Stodolsky,

andJ.Zelenka.Informedprefetchingandcaching.InProceedingsofthefifteenthACMsymposiumonOp-eratingsystemsprinciples,pages79–95.ACMPress,1995.

[21]P.J.ShenoyandH.M.Vin.Cello:adiskscheduling

frameworkfornextgenerationoperatingsystems.InProceedingsofthe1998ACMSIGMETRICSjointin-ternationalconferenceonMeasurementandmodelingofcomputersystems,pages44–55.ACMPress,1998.[22]G.A.S.Whittle,J.-F.Paris,A.Amer,D.D.E.Long,

andR.Burns.Usingmultiplepredictorstoimprovetheaccuracyoffileaccesspredictions.In20thIEEEConferenceonMassStorageSystemsandTechnologies,April2003.

[23]J.Wilkes.Travelingtorome:Qosspecificationsforau-tomatedstoragesystemmanagement.InInternationalWorkshoponQoS,pages75–91,June2001.

因篇幅问题不能全部显示,请点此查看更多更全内容