AjayGulati
DepartmentofComputer
ScienceRiceUniversityemail:gulati@rice.edu
Abstract—ParallelI/Oarchitecturesareincreasinglydeployedforhighperformancecomputingandinshareddatacenters.IntheseenvironmentsitisdesirabletoprovideQoS-basedallocationofdiskbandwidthtodifferentapplicationssharingtheI/Osystem.Inthispaper,weintroduceamodelofdiskbandwidthallocation,andprovideefficientschedul-ingalgorithmstoassignthebandwidthamongtheconcurrentapplications.
I.Introduction
ParallelI/Oarchitecturesconsistingofmulti-pledisksconnectedwithhighbandwidthin-terconnectarethenorminhigh-performancedatacentersandsupercomputinginstallations.Dataisdistributedamongthediskstoen-ablesimultaneousparallelaccess.IndividualapplicationscanpotentiallyspeeduptheirI/Obyfetchingdatablocksinparallelfrommultipledisks,andbufferingtheminmemoryuntilrequired.Thebenefitofprefetchingde-pendonseveralfactorssuchasthetemporaldistributionofdiskaccesses,theamountofbuffermemoryavailabletosmoothoutun-evendistributionsofrequests,andtheamountoflookaheadavailable.Asubstantialbodyofworkexistsontheproblemofprefetchingfrommultipledisks,dealingbothwithissuesofaccesspredictionandschedulingatthesystemlevel[15],[16],[20],[22],aswellasschedulingalgorithmsandtheiranalyses[1],[11],[13],[14].Manyoftheseissuesarefairlywellunderstoodatthistime.
Whenseveralconcurrentapplicationsaresi-multaneouslysharingtheI/Osystem,theschedulingoftheI/Osbecomesmorecompli-cated.Twoissuesofconcerninthissituationaremaximizingutilizationofthediskband-
PeterVarmanDepartmentofECE&
andCSRiceUniversityemail:pjv@rice.edu
widthbymultiplexingitamongtheconcur-renttasks,andavoidingstarvationofindivid-ualtasks.Evenwhenconsideredseparatelytheproblemsarechallenging.Forinstance,incontrasttothecaseofasingleaccesssequenceforwhichefficientschedulingalgo-rithmshavebeenfound[1],[11],[13]–[15],theproblemofmaximizingthediskband-widthutilizationwhileschedulingasetofn,n>1,referencestringscanbeshownNP-complete[9].Besidesbeingcomputationallyintractableandrequiringaprioriknowledgeoftheentiresetofaccesssequences,suchaminimal-lengthschedulemaybeunattractivefromtheviewpointoffairnessintheallo-cationofthediskbandwidthtoindividualapplications.
InthisextendedabstractwepresentamodelforfairdiskbandwidthallocationamongconcurrenttasksinaparallelI/Osystem,andpresentefficientschedulingalgorithmsforimplementingseveralallocationpolicies.Bynecessity,ourmodelisanidealizedab-stractionofacomplexparallelI/Osystemwheremechanicaldiskdelaysandintercon-nectschedulingissuesneedtobeconsidered.Manyofthesefactorscanbecapturedbyamoredetailedmodel;thisworkwillcon-centrateonthehigher-levelschedulingissues.Moredetailswillbeprovidedinthecompletepaper.
Therestofthepaperisorganizedasfollows.InSectionIIwedescribethemodelandtheattendantdefinitions.InSectionIIIandIVwepresentouralgorithmsforschedulingwithfairnessandweighted-QoS.SectionVcontainssomepreliminarysimulationresults.
WereviewrelatedworkinsectionVIandendwithsomeconclusionsinsectionVII.
II.Overview
AparallelI/OsystemconsistsofasetofdindependentdisksD={D1,D2,···,DDataisstoredonthedisksinunitsofblocksd}.;ablockistheunitofaccessfromadisk.IneachparallelI/Ooperationasetofuptodblocks,onefromeachdisk,canbeaccessed.Theblocksfetchedfromthedisksmaybebufferedinaninternalmemorybufferuntiltheyarerequired.PrefetchingcangreatlyreducetheoverallI/Olatency.
AsetofnindependentapplicationsortasksisassumedtobeconcurrentlyaccessingtheI/Osystem.EachtaskisabstractedbyareferencestringRi,1≤i≤n;Riistheorderedsequenceofblocksthatisrequiredbythatapplication.ForeachreferencestringthesystemhasalookaheadofL−1blocks,L≥1,beyondthecurrentrequestfromthatstring.Thislookaheadenablestheapplicationtoprefetchblocksthatwillbeaccessedinthenearfutureandholdtheminthebufferuntilrequired.AnI/OscheduleconsistsofasequenceofparallelI/Osteps:ineachstepatmostoneblockfromeachoftheddisksisaccessed.AnI/OscheduleissaidtobeworkconservingifeveryI/Ostepemploysmaximalparallelism:thatis,adiskisbusyunlessthereisnorequestforthatdiskorthereisnotenoughbufferspacetoholdtheaccessedblock.InthispaperwewillassumethatthebufferislargeenoughtoholdthelookaheadLofallthestrings:henceinaworkconservingschedule,theonlyidledisksarethosewithnorequestsatthatstep.Underthisassumptionoflimitedlookahead,ithasbeenshownthataworkconservingscheduleminimizesthenumberofI/Ostepsrequiredforanysinglereferencestringinisolation[13].
AparallelI/Ostepwillberepresentedbya
fetchvector,F=[b1,b2,···,bn],wherebiisthenumberofblocksfromreferencestringRithatarefetchedinthatI/Ostep.NotethatsinceatmostdblockscanbefetchedinanyparallelI/Ostep,∑1≤i≤nbi≤d.Thecumu-lativefetchvector,CF=[BB1,B2,···,Bn],whereiisthetotalnumberofblocksfromreferencestringRifetchedsofar.NotethatCFcanbeobtainedbythecomponent-wiseadditionofthefetchvectorsforeachoftheprecedingI/Osteps.Forweightedal-locationofbandwidth,weletwiindicatetherelativepriorityassignedtoareferencestringRi.TheproportionatevectorVp=[v1,v2,···,vn]wherevi=Bi/wi.Finally,werefertothesumofthecomponentsofavectorasitsweight.
WewillbeinterestedinconstructingworkconservingschedulessincesuchscheduleswillmaximizethediskutilizationateveryI/Ostep.Inaddition,wewouldlikethescheduleateachsteptobefair:byfairnessweintuitivelymeanthatwetrytofetchasevenlyaspossibleforallthereferencestrings.Formally,wesaythatthescheduleatastepisfairifthefetchvectorFatthatstepislex-icographicallysmallestamongalln-vectorswiththesameweight.Weformallystatethedefinitionoflexicographicminimumbelow:ConsidertwovectorsF=[f1,f2,···fn]andG=[g1,g2,···,gn],suchthat∑f,ifi=∑foralligi,andthati≥fi+1andgi≥gi+11≤i≤n−1.ThenFislexicographicallysmallerthanGifandonlyifthereissomeindexk,k≥1,suchthatfi=gi,1≤i≤k−1andfgfollowingk 1)LocallyFairAllocation:AchieveafairscheduleateachI/Ostepthatisworkconserving. 2)GloballyFairAllocation:Achieveafairwork-conservingschedulebasedonthecumulativenumbersofblocksac-cessedbyeachstring. 3)WeightedAllocation:Achieveaworkconservingscheduleinwhichthecumu-lativenumberofblocksaccessedafteranystepareinproportiontotherelativeprioritiesofthereferencestrings. A.OverviewoftheAlgorithms Atanytime,thereareasetofcandi-dateblocksfromeachreferencestringthatcanbefetchedinthisI/Ostep.Thesearetheunfetchedblocksinthelookaheadforthatstring.Thesystemismodeledbyan(Vaugmented{s,t},EbipartiteEt resourcegraphG=Es)definedasfollows: • V={D1,D2,···,Dd} {R1,R2,···,Rn}isasetofn+dnodes,oneforeachreferencestringanddiskinthesystem.• Eisthesetofdirectededgesbetweennodesrepresentingreferencestringsandnodesrepresentingdisks:thereisanedge(Ri,Dj)wheneverthereisarequesttodiskDjinthecurrentlookaheadwindowofreferencestringRi. • Distinguishedverticessandtwillserveasthesourceandsinkofpathsthroughthegraph. •Et={(Dj,t),1≤j≤d},isthesetofdirectededgesfromeachdisknodetot.• Es⊆{(s,Ri),1≤i≤n},isasubsetofthedirectededgesfromstostringnodesRi;thissubsetchangesdynamicallyasthealgorithmproceeds. Wewillshowthattheschedulingproblemsdefinedabovecanbemappedtofindingasetofpathsinadynamicallyevolvingresource graph.InitiallytheresourcegraphconsistsofGdefinedabovewithEsbeingempty.Ouralgorithmswillmaintainapriorityvector,P=[p1,p2,···,pn]:piisthepriorityforstringnodeRi.Ateachstepthenodewithhighest-priorityisselected,andanedgefromstothisnodeisaddedtoEs.Thealgorithmthenattemptstofindapathinthecurrentresourcegraphfromstotusingtheedgebetweensandtheselectednode. Ifapathcannotbefoundthroughthecur-rentlyselectednodeRi,thenthenodeismarkedassaturated,andthealgorithmwilladjustthepriorityofRisothatitwillnotbeselectedagain.Asaturatednodemeansthatitisnotpossibletoincreasethetotalnumberofscheduleddisksbymakingafurtherassign-menttothisreferencestring(i.e.theweightofthefetchvectorwillnotincrease).Anassignmenttoasaturatednodewillcomeattheexpenseofreducingtheassignmenttooneofthepreviouslyschedulednodes.Thechoiceofprioritiesissuchthatthisreallocationisundesirable. Ifthesearchforapathissuccessfulthismeansthattheweightofthefetchvectorcanbeincreasedbyassigninganadditionaldisktotheithstring.Thepreviousassignmentofdiskstostringsmightgetchangedtomakethispossible,butthenumberofdisksassignedtoanyotherstringwillnotchangebythisreassignment.Thustheweightofthefetchvectorisincreasedwithoutdisturbingtherelativeallocationsneededforfairness.Theresourcegraphismodifiedtoreflectthenewassignmentsasdescribedbelow.Thisstepwillbereferredtoaspathconditioninginthedescriptionofthealgorithms. PathConditioning:Inthecurrentresourcegrapheveryedge(Ri,Dj)inthediscoveredpathisreplacedbyitsreverseedge(Dj,Ri),andsimilarlyeveryedge(Ds,Rt)inthepathisreplacedbythereverseedge(Rt,Ds).Thelastedgeonthediscoveredpathfromsome Dutotisalsoreplacedbythereverseedge(t,Du). Theinvariantmaintainedbythealgorithmisthatthepresenceofanedge(Ds,Rt)intheresourcegraphatthestartofaniterationmeansthatcurrentlydiskDsisassignedtostringRt.SupposethepathgoingthroughselectednodeRuatthisiterationis(s,Ru,Dj1,Rj1,Dj2,Rj2,···,Djimpliesthatatthestartk,Rjofthek,Djiterationk+1,t).Thistherewerekassignments:diskDjtostringRWheniwasassignedji,1≤i≤k.theedgesbetweenstringanddisknodesarereversed,thek+1newassignmentswillbe:Dj1toRu,DjD2toRj1,andsoonendingwithjk+1assignedtoRjk.NotethatforeachofthestringverticesRjionlytheidentityoftheassigneddiskwaschanged.Notethatalthoughrelatedtotheproblemsofbipartitegraphmatchinganddeterminingmaximumflow[7],therearesubtledifferencesthatprecludedirectapplicationofeitherofthesealgorithmstoourproblem[9].OuralgorithmisinspiredbytheideasofaugmentingpathsandresidualgraphsemployedintheFordFulkersonalgorithmformaximumflowinanetwork,buthasbeensuitablyrefinedfortheproblemathand. III.Localfairness Problemdefinition:Findanassignmentofdiskstoreferencestringssuchthatthefetchvector,F=[b1,b2,...,bn]hasmaximalweightandislexicographicallyminimum. IntheresourcegraphGdefinedearlierletD ˆbethenumberofdisknodesthathaveatleastoneincidentedge.Thenthemaximalweight ofthefetchvectorisD ˆ,i.e.awork-conservingschedulewillassignD ˆdisks.A.LFSalgorithm AstraightforwardalgorithmtoobtainlocalfairnesscantryallassignmentsorderedpartitionsofD ˆmadeupof intoncomponents,inincreasinglexicographicorderuntilafeasiblescheduleisfound.However,therunningtimeofsuchanalgorithmwouldbethereareθ(D ˆnexponentialas )possiblepartitionsofDˆintonparts,wheretheorderofpartsisimportant.ThealgorithmLFSshowninfigure1givesthemaximalweight,lexicographicallymin-imumfetchvectorinO((n+D)|E|)time.Thealgorithmhasthestructuredescribedinoverview.Thepriorityvectorbeginswithallcomponents0;ateveryiterationeithertheweightofthepriorityvectorincreasesby1,oronecomponent(stringnode)issaturated.The(nonsaturated)componentwithsmallestvaluewillhavethehighestpriorityinthevectorP.Theuseofthedynamicgraphcreatedbythepathconditioningstepensuresthattheassignmentsofdisksatsomestagedonotprecludereassignmentatalaterstage.ThisobservationandthepruningofsaturatednodesallowsthenomorethanD ˆalgorithmtoterminatein +niterations;eachiterationrequiresanO(|E|)pathsearchingalgorithm.WepresentsomelemmasrelatedtoLFSal-gorithm(theproofsareomittedduetolackofspace). Lemma1:Onceadisknodehasbeenas-signedtosomestringnodeitwillneverbecomefree(althoughitmaybereassignedtoadifferentstringnode). Lemma2:LFSfindsthecographicalallocationofD ˆminimumlexi-diskblockstoreferencestrings. Lemma3:TherunningtimeofLFSalgorithmisO((n+D)|E|). IV.Global FairnessandQoS Scheduling Thelocalfairnessmetricaimstodistribute diskaccessesfairlyateveryI/Ostep.Therearesituationswheresuchanapproachmaybeinadequate.Ifanapplicationhasaburstofrequeststojustafewdisks,itwillreceivea (1)(2)(3)(4)(5)(6) (7)(8)(9)(10)(11)(12) Fig.1. LFS(LocallyFairScheduling)algorithm: ofblocksaccessed,sothatstringswhichhave weight=0 thelargestslack(orminimumBi)atanystepGisaugmentedresourcegraph,withEs=φaregivenachancetocatchup(instep6).TheLetpriorityvectorP=[0,0,...,0]. Markallelementsofthevectorasnon-saturated.restofthealgorithmremainsthesame.The ˆwhile(weight elementpiofPwithtiesbrokenarbitrarily.Addanedge(s,Ri) A.WeightedallocationFindanypathfromstotthatincludesthe edge(s,Ri). if(nopathisfound) ForQoS,wewanttoassignunequalpro-MarkpiassaturatedinP. portionsofresourcestothetasksneedingelse them.Westudytheproblemofassigningdiskweight=weight+1;pi=pi+1 UpdategraphGbypathconditioningbandwidthinaskewedmannertomultiple O((n+D)|E|)algorithmforlocalfairscheduling smallerfractionofthebandwidth;itmaybe desirabletofavorthesestringsoncethehotspotactivitypasses.Inothercases,asystem-aticalbeitsmalldifferenceinlocalallocationateachstep,mayresultinasignificantspreadofthecumulativebandwidthallocationforlongrunningapplications.Inordertohandlesuchsituationswedefineaglobalfairnesscriterion,anddescribehowwecanadaptourbasicalgorithmforthispurpose.Intuitivelytheglobally-sensitivealgorithmcalledGFS,keepsthehistoryofaccessesinthecumu-lativefetchvectorCF.InthecurrentI/Oassignment,GFSwilltryandequalizethecomponentsofCF,byfavoringstringswhicharelagginginthetotalnumberofblocksfetchedsofar. TheoutlineofthealgorithmisquitesimilartotheLFSalgorithminFigure1.AsinLFS,weighttracksthenumberofdisksallocatedatanystepofthealgorithm.WeletCF=[B1,B2,···,Bn]betheCFvectoratthestartoftheI/Ostep.InitializethepriorityvectorP=[p1,p2,···,pn],wherepi=Bi(instep4):pitracksthetotalnumberofblocksthathavebeenfetchedfortheithstring.Inthemainloop,stringsarescheduledinincreasingorder referencestrings.Lettheproportionatevec-torVp=[v1,v2,···,vn],wherevi=Bi/wiforithstring.Theproblemcanbestatedmoreformallyasfollows: Problemdefinition:Findanassignmentofdiskstoreferencestringsthatmaximizestheweightofthefetchvectorandlexicographi-callyminimizestheproportionatevectorVp=[v1,v2,···,vn]. ThealgorithmforweightedallocationcalledWeAB,keepsthehistoryofaccessesinthecumulativefetchvectorCF.InitializethepriorityvectorP=[p1,p2,···,pn]aspi=Bi/wi.Intuitivelypiindicatetheproportionatebandwidthgiventotheithstringsofar.InthecurrentI/Ostep,WeABwillassigndiskstotryandequalizethecomponentsofP,byfa-voringstringswhicharelaggingintheirpro-portionofblocksfetched.NotethatassigningadisktoRiincreasespiby1/wiinsteadof1.Inthemainloop,stringsarescheduledinde-creasingorderofproportionateslack,sothatstringswhichhavethelargestproportionateslackatanysteparegiventhefirstchancetodecreasetheirslack.Mathematically,thestringwiththeminimumvalueof(Bi+1)/wihasthehighestpriority.Ateachstepeitherastringnodegetssaturatedortheweight ˆdisksareincreasesby1.ItcontinuesuntilD scheduled.Amoreformaldescriptionofthealgorithmisgiveninfigure2.TheWeABalgorithmalsorunsinO((n+D)|E|)time. (1)(2)(3)(4)(5)(6) (7)(8)(9)(10)(11)(12) Fig.2. WeABalgorithm:weight=0 GisaugmentedresourcegraphwithEs=φLetpriorityvectorP=[p1,p2,···,pn],wherepi=Bi/wi.Markallpi’sasnon-saturated ˆwhile(weight Psuchthat(Bi+1)/wiisminimumwithtiesbrokenarbitrarily.Addanedge(s,Ri) Findanypathfromstotthatincludestheedge(s,Ri). if(nopathisfound) MarkpiassaturatedinP.else weight=weight+1;Bi=Bi+1;pi=Bi/wiUpdategraphGbypathconditioning O((n+D)|E|)algorithmforweightedallocation Bandwidth vs Number of IOs, D=64, N=4 0.4 Reference string 1Reference string 2Reference string 3Reference string 4 0.35 Bandwidth allocated 0.3 0.25 0.2 0.15 0.1 0 50 100 150 200IO number 250 300 350 400 Fig.3.GFS:fairallocationfor4strings Bandwidth vs Number of IOs, D=64, N=3 0.5 Reference string 1(0.5)Reference string 2(0.3)Reference string 3(0.2) 0.45 V.Results Weevaluatethealgorithmsforglobalandproportionalallocationbysimulations.Forglobalfairness,weperformedanexperimentwith4referencestrings(generatedrandomly)and64disks.Theamountoflookaheadisequaltonumberofdisksforeachreferencestring.Figure3showstheactualdistributionofbandwidthamongthestrings.Thefigureconfirmsthatthebandwidthisequallydis-tributedamongallfourstrings.Initialfluctu-ationsarerelatedtothedistributionofdiskrequestsinthereferencestringsandtheworkconservingnatureofthealgorithm,thatpre-ventsanydiskfromidlingifsomerequestforthatdiskcanbescheduled.Forweightedallocation,weperformedexperimentswith3randomly-generatedreferencestringsand64disks.StringsR1,R2andR3havebeenassignedrelativeprioritiesof0.5,0.3and0.2respec-tively.Figure4showstheactualbandwidthallocatedtothestringsusingWeABalgo-rithm.TheresultshowninthefigureindicatesthatWeABalgorithmallocatesdifferentiatedbandwidthtovariousstringsaccordingtotheweightsassignedtothem. Wealsoexperimentedwithskewedinput Bandwidth allocated 0.4 0.35 0.3 0.25 0.2 0.15 0 20 40 60 80 100IO number 120 140 160 180 200 Fig.4.WeABwithweights0.5,0.3,0.2 models.Wearecurrentlyextendingtheevalu-ationtoincorporatephysicaldiskparametersandvariableaccesstimes.Amorecompre-hensiveevaluationofthealgorithmswillbepresentedinthefullpaper.Wealsoex-perimentedwithskewedinputmodelsandwewillbepresentingamorecomprehensiveevaluationofalgorithmsinthefullpaper. VI.Relatedwork Severaldiskschedulingpolicieshavebeenproposedforsingleaswellasmultipledisks.Haritsaetal.[10]havepresentedefficientandfairschedulingalgorithmsforasingledisk.Theytrytoachievefairnessatthecylinder levelbyproposingstrategiesfordiskheadmovementafterarequesthasbeenserviced.TheYFQ[4]algorithmwhichisanextensiontogeneralizedprocessorsharing(GPS)modelofproportionalresourceallocation,allowsapplicationstoreserveafixedshareofdiskbandwidth.Cello[21]providesaschedulingframeworkforaheterogeneousmixofappli-cationsaccessingthedisk.Itemploysa2-levelschedulingframework,whererequestsfromapplicationsareputintodifferentqueuesbasedontheirneedsatthefirstlevelandacommonschedulerreorderstherequestsselectedfromthesequeuestominimizeseekandrotationaldelaysatthesecondlevel.Ourapproachhowever,providesbandwidthallocationacrossmultipledisks.Bybatchingrequeststhatbelongtoasinglediskmanyofthelowerleveloptimizationssuggestedintheseworkscanbeemployedalongwithourhigherlevelframework. Lumbetal.presentedFacade[6]thatisanapproachtofulfillServiceLevelObjective(SLO)ofindependentworkloadsaccessingastoragesystem.Facadeprovidesadynamictrade-offbetweentheIOrateandaveragelatencybyreal-timeschedulingandfeedbackbasedcontrolofdiskqueuelengths.Theirap-proachassumestheavailabilityofacapacityplannerforadmissioncontrol.OurapproachprovidesafinercontroloverallocationofdiskstoworkloadsineachI/Otoachievethedesiredbandwidthallocation.Interposedproportionalsharingalgorithmsaresuggestedin[12]thatassignvirtualstartandfinishtimestorequestsbasedontheircost,clientqueueandarrivaltime.In[12],theserverisconsideredasasingleresourcesharedbyalltheclients,whereasweattainfairnessformultipleinputstreamsrequestingmultipleresources(disks). ManyschemeshavebeendevelopedforfairschedulinginInternetrouters.SchemessuchasiSLIP[17],shakeup[8]trytoachieve highthroughputandfairnessbylookingformaximalbipartitematchingbetweeninputsandoutputsateachpointofscheduling.iFS[18]providesfairnessandQoSguaranteesbyassigningvirtualstartandfinishtimestoeachpacket.Thesealgorithmstrytoobtainan1-to-1mappingbetweeninputandoutputportsinsteadof1-to-manymappingrequiredforourproblem.Iterativelyapplyingbipartitematchingdoesnotguaranteeafairscheduleinourcasebecauseitgivesanirrevocablemappingthatcannotbechangedinfutureiterations.Manyotherarchitectureshavebeenproposedfordifferentiatedallocationofband-widthtoflowssuchasdiffservs[19]andQoSBox[5].NoneofthemcanbeusedforQoSinautomatedstoragesystems[23].Asstatedin[23],performanceofstoragesystemsdependsalotonthecurrentstateandstorageprotocolsarenotamenabletopacketdroppingandtrafficshaping. AutomatedtoolssuchasHippodrome[3],Minerva[2]canbeusedtodesignstoragesystems,ifthecapacityrequirementsandworkloadcharacteristicsareknownapriori.Thesetoolsgothroughtheiterativeprocessofdesigningandevaluatingthesystem.Oncetheworkloadsoranyotherrequirementchangesthewholeprocessneedstoberepeatedagain.Ourapproachcanhandleshorttermfluctua-tionsandprovideweighted-QoSeveninthepresenceofunpredictableworkloads. VII.Conclusions InthispaperweexploredschedulingschemestoprovidebandwidthallocationinparallelI/Osystems.LFSalgorithmobtainslocalfairnessateachI/Ostep.GFSalgorithmprovidesfairallocationoverlargertimewindowsbykeepingahistoryofblocksaccessedbyeachstring,andWeABisageneralschemetoobtainanI/Oschedulewithprioritizedallo-cationofdiskbandwidthtovariousstrings.WeABcansupportaneconomicmodelof servicesprovidedbyadatacenterthatneedtoprovidedifferentiatedbandwidthtovar-iouscustomers.Wedemonstratedviasim-ulationsthattheWeABalgorithmmatchesthebandwidthallocatedtovariousapplicationtotheirdesiredpriotiryandGFSobtainsafairschedule.Allthesealgorithmsareworkconservingandprovidehighthroughput.Weplantoimplementouralgorithmsinanactualsystemtogetabetterideaaboutperformanceandscalabilityofourapproachinfuture. Acknowledgments SupportforthefirstauthorwasprovidedbytheNationalScienceFoundationunderGrantCCR-0105565.ResearchtimeduringIPAas-signmentofthesecondauthorwasprovidedbytheNationalScienceFoundationundertheIRDprogram. References [1]S.Albers,N.Garg,andS.Leonardi.Minimizingstall timeinsingleandparalleldisksystems.J.ACM,47(6):969–986,2000. [2]G.A.Alvarezandetal.Minerva:anautomated resourceprovisioningtoolforlarge-scalestoragesys-tems.InACMTransactionsonComputerSystems,pages483–518,November2001. [3]E.Andersonandetal.Hippodrome:runningcircles aroundstorageadministration.InFileandStorageTechnology(FAST’02),pages175–188,January2002.[4]J.Bruno,J.Brustoloni,E.Gabber,B.Ozden,and A.Silberschatz.Diskschedulingwithqualityofserviceguarantees.InProceedingsoftheIEEEInternationalConferenceonMultimediaComputingandSystemsVolumeII-Volume2,page400.IEEEComputerSociety,1999. [5]N.ChristinandJ.Liebeherr.TheQoSbox:APC-RouterforQuantitativeServiceDifferentiationinIPNetworks.Technicalreport,2001. [6]A.M.ChristopherLumbandG.Alvarez.Facade: Virtualstoragedeviceswithperformanceguarantees.FileandStoragetechnologies(FAST’03),pages131–144,March2003. [7]T.H.Cormen,C.E.Leiserson,andR.L.Rivest. IntroductiontoAlgorithms.TheMITPress,pages579–598,1999. [8]M.W.Goudreau,S.G.Kolliopoulos,andS.B. Rao.Schedulingalgorithmsforinput-queuedswitches:randomizedtechniquesandexperimentaleval-uation.IEEEINFOCOM,3:1634–1643,March2000. [9]A.Gulati.SchedulingwithQoSinparallelI/Osystems. Master’sthesisinprogress,RiceUniversity,June2004.[10]J.R.HaritsaandT.Pradhan.Fairdiskschedulers forhighperformancecomputingsystems.Proc.ofInetrnationalConf.onHighPerformanceComputing,December1995. [11]D.A.Hutchinson,P.Sanders,andJ.S.Vitter.Duality betweenprefetchingandqueuedwritingwithparalleldisks.InProceedingsofthe9thAnnualEuropeanSym-posiumonAlgorithms,pages62–73.Springer-Verlag,2001. [12]W.Jin,J.S.Chase,andJ.Kaur.Interposedproportional sharingforastorageserviceutility.SIGMETRICSPerform.Eval.Rev.,32(1):37–48,2004. [13]M.KallahallaandP.Varman.Optimalread-once paralleldiskscheduling.Proc.6thACMWorkshoponI/OinParallelandDistributedSystems(IOPADS’99),April1999. [14]M.KallahallaandP.J.Varman.Optimalprefetching andcachingforparallelI/Osystems.In13thACMSymposiumonParallelAlgorithmsandArchitectures,pages219–228,July2001. [15]T.Kimbrel,P.Cao,E.W.Felten,A.R.Karlin,and K.Li.Integratedparallelprefetchingandcaching.InProceedingsofthe1996ACMSIGMETRICSinter-nationalconferenceonMeasurementandmodelingofcomputersystems,pages262–263.ACMPress,1996.[16]D.KotzandR.Jain.I/Oinparallelanddistributed systems.InA.KentandJ.G.Williams,editors,EncyclopediaofComputerScienceandTechnology,volume40,pages141–154.MarcelDekker,Inc.,1999.[17]N.McKeown.TheiSLIPSchedulingAlgorithmfor Input-QueuedSwitches.IEEE/ACMTransactionsOnNetworking,7:188–201,April1999. [18]N.NiandL.N.Bhuyan.Fairschedulingininternet routers.IEEETransactionsOnComputers,pages686–701,June2002. [19]K.Nichols,S.Blake,F.Baker,andD.Black.Anar-chitecturefordifferentiatedservices.Technicalreport,December1998. [20]R.H.Patterson,G.A.Gibson,E.Ginting,D.Stodolsky, andJ.Zelenka.Informedprefetchingandcaching.InProceedingsofthefifteenthACMsymposiumonOp-eratingsystemsprinciples,pages79–95.ACMPress,1995. [21]P.J.ShenoyandH.M.Vin.Cello:adiskscheduling frameworkfornextgenerationoperatingsystems.InProceedingsofthe1998ACMSIGMETRICSjointin-ternationalconferenceonMeasurementandmodelingofcomputersystems,pages44–55.ACMPress,1998.[22]G.A.S.Whittle,J.-F.Paris,A.Amer,D.D.E.Long, andR.Burns.Usingmultiplepredictorstoimprovetheaccuracyoffileaccesspredictions.In20thIEEEConferenceonMassStorageSystemsandTechnologies,April2003. [23]J.Wilkes.Travelingtorome:Qosspecificationsforau-tomatedstoragesystemmanagement.InInternationalWorkshoponQoS,pages75–91,June2001. 因篇幅问题不能全部显示,请点此查看更多更全内容