EFFICIENT ORGANIZATION AND ACCESS OF MULTI-DIMENSIONAL DATASETS ON TERTIARY STORAGE SYSTEMS

L.T. Chen¹, R. Drach², M. Keating², S. Louis², D. Rotem¹, A. Shoshani¹

¹ Lawrence Berkeley Laboratory, Berkeley CA 94720
² Lawrence Livermore Laboratory, Livermore CA 94550

(Published in Information Systems Journal, April 1995)
Abstract: This paper addresses the problem of urgently needed data management techniques for efficiently retrieving requested subsets of large datasets from mass storage devices. This problem is especially critical for scientific investigators who need ready access to the large volume of data generated by large-scale supercomputer simulations and physical experiments as well as the automated collection of observations by monitoring devices and satellites. This problem also negates the benefits of fast networks, because the time to access a subset from a large dataset stored on a mass storage system is much greater than the time to transmit that subset over a fast network. This paper focuses on very large spatial and temporal datasets generated by simulation of climate models, but the techniques described here are applicable to any large multidimensional grid data. The main requirement is to efficiently access relevant information contained within much larger datasets for analysis and interactive visualization. Although these problems are now becoming more widely recognized, the problem persists because the access speed of robotic storage devices continues to be the bottleneck. To address this problem, we have developed algorithms for partitioning the original datasets into "clusters" based on analysis of data access patterns and storage device characteristics. Further, we have designed enhancements to current storage server protocols to permit control over physical placement of data on storage devices. We describe in this paper the approach we have taken, the partitioning algorithms, and simulation and experimental results that show 1 to 2 orders of magnitude in access improvements for predicted query types. We further describe the design and implementation of improvements to a specific storage management system, UniTree, which are necessary to support the enhanced protocols. In addition, we describe the development of a partitioning workbench to help scientists select the preferred solutions.

1. INTRODUCTION
Scientists working with spatio-temporal data do not naturally think of their data in terms of files or collections of files, but rather in terms of basic abstractions such as spatial and temporal variables, multidimensional arrays, and images. This work is directed toward providing support for such abstractions within the context of current hierarchical mass storage systems. One of the most critical issues for scientific investigators is the increased volume of data generated by large-scale supercomputer simulations and physical experiments. In addition, the automated collection of observations by monitoring devices and satellites produces vast amounts of data at increasingly faster rates. These large datasets have, in some cases, led to unreasonably long delays in data analysis. In these situations, the speed of supercomputers is no longer an issue; instead, it is the ability to quickly select subsets of interest from the large datasets that has become the major bottleneck.

To address this need, we have developed algorithms for partitioning datasets into "clusters" based on anticipated data access patterns and storage device characteristics, as well as enhancements to current storage server protocols to permit control over physical placement of data on storage devices. The access patterns considered in this paper are range specifications in the multidimensional space. The techniques developed can be applied to any multidimensional datasets, although our emphasis and example applications are on spatio-temporal datasets.

In order to have a practical and realistic environment, we chose to focus on developing efficient storage and retrieval of climate modeling data generated by the Program for Climate Model Diagnosis and Intercomparison (PCMDI). PCMDI was established at Lawrence Livermore National Laboratory (LLNL) to mount a sustained program of analysis and experimentation with climate models, in cooperation with the international climate modeling community [1].
To date, PCMDI has generated over one terabyte of data, mainly consisting of very large, spatio-temporal, multidimensional data arrays.

The main requirement is to efficiently access relevant information contained within much larger datasets for analysis and interactive visualization. Although the initial focus of this work is on spatial and temporal data, our results can be applied to other kinds of multidimensional grid datasets.

The developmental and operational site for our work is the National Storage Laboratory, an industry-led collaborative project [2] housed in the National Energy Research Supercomputer Center (NERSC) at LLNL. The system integrator for the National Storage Laboratory, IBM Federal Sector Division in Houston, has projects already in place that are investigating improved access interface and data reorganization techniques for atmospheric modelers at NCAR [3]. Many aspects of our work complement the goals of the National Storage Laboratory.
2. BACKGROUND
Large-scale scientific simulations, experiments, and observational projects generate large multidimensional datasets and store them temporarily or permanently in an archival mass storage system until they are needed for analysis or visualization, as shown in Figure 1. For example, a single dataset (usually a collection of time-history output) from a climate model simulation may comprise from one to twenty gigabytes of data. Typically, this dataset is stored on up to one hundred magnetic tapes, cartridges, or optical disks (current IBM 3480 tape cartridge technology, used in the storage systems at LLNL, allows 200-250 megabytes per cartridge). These kinds of tertiary devices (i.e., one level below magnetic disk), even if robotically controlled, are relatively slow. Taking into account the time it takes to load, search, read, rewind, and unload a large number of cartridges, it can take many hours to retrieve a subset of interest from a large dataset.
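As a rough illustration of why such retrievals take hours, the following back-of-envelope sketch estimates end-to-end retrieval time across 3480-class cartridges. Only the 200-250 megabyte cartridge capacity comes from the text; the per-cartridge timings and the drive transfer rate are illustrative assumptions in the spirit of 1990s tape robotics, not measured values from the paper.

```python
# Back-of-envelope estimate of the time to read a subset scattered across
# many tape cartridges. All timing constants are illustrative assumptions;
# only the ~200 MB capacity of IBM 3480 media comes from the paper.

MOUNT_S = 30.0    # assumed robot pick + load + thread time per cartridge
SEEK_S = 45.0     # assumed average search time to position within a tape
REWIND_S = 40.0   # assumed rewind time before unload
UNLOAD_S = 20.0   # assumed unload + return-to-slot time
XFER_MB_S = 3.0   # assumed sustained drive transfer rate (MB/s)

def retrieval_hours(cartridges: int, mb_read_per_cartridge: float) -> float:
    """Estimate hours to visit `cartridges` tapes and read the given MB from each."""
    per_tape = (MOUNT_S + SEEK_S + REWIND_S + UNLOAD_S
                + mb_read_per_cartridge / XFER_MB_S)
    return cartridges * per_tape / 3600.0

# A 20 GB dataset on 100 cartridges (~200 MB each), where the subset of
# interest is scattered so every cartridge must be mounted and read in full:
print(f"{retrieval_hours(100, 200):.1f} hours")  # ~5.6 hours
# The same subset clustered onto just 2 cartridges:
print(f"{retrieval_hours(2, 200):.1f} hours")    # ~0.1 hours
```

Under these assumptions the per-cartridge overhead dominates, which is why reducing the number of volumes touched matters as much as reducing the bytes read.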
This inefficiency generally requires that the entire set of original data be retrieved and downloaded to a disk cache for the researcher to analyze or interactively visualize a subset of the data. Future hardware technology developments will certainly help the situation. Data transfer rates are likely to increase by as much as an order of magnitude, as will tape and cartridge capacities. However, new supercomputers and massively parallel processor technologies will outstrip this capacity by allowing scientists to calculate ever finer resolutions and more time steps, thereby generating much more data. Because most of the data generated by models and experiments will still be required to reside on tertiary devices, and because it will usually be the case that only a subset of that data is of immediate interest, effective management of very large scientific datasets will be an ongoing concern.
A similar situation exists in many scientific application areas. For example, the Earth Observing System (EOS) currently being developed by NASA [3,4] is expected to produce very large datasets (100s of gigabytes each). The total amount of data that will be generated is expected to reach several petabytes, and thus will reside mostly on tertiary storage devices. Such datasets are usually abstracted into so-called "browse sets" that are small enough to be stored on disk (using coarser granularity and/or summarization, such as monthly averages). Users typically explore the browse sets at first, and eventually focus on a subset of the dataset they are interested in. We address here this last step of extracting the desired subsets from datasets that are large enough that they are typically stored on tape.
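To make the browse-set idea concrete, here is a minimal sketch of deriving a monthly-average browse set from daily data. The array shapes and the use of NumPy are illustrative assumptions; the paper only states that browse sets use coarser granularity or summarization such as monthly averages.

```python
import numpy as np

# Hypothetical sketch: coarsening one year of daily data for a single
# variable into a monthly-average "browse set". Shapes are assumptions.

daily = np.random.rand(365, 64, 128).astype(np.float32)  # (day, lat, lon)

month_lengths = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
starts = np.cumsum([0] + month_lengths[:-1])

# Browse set: monthly means, shape (12, 64, 128). It is ~30x smaller, small
# enough to keep on disk while the full-resolution data stays on tape.
browse = np.stack([daily[s:s + n].mean(axis=0)
                   for s, n in zip(starts, month_lengths)])
print(daily.nbytes / browse.nbytes)  # ~30x size reduction
```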
It is not realistic to expect commercial database systems to add efficient support for various types of tertiary storage soon. But even if such capabilities existed, we advocate an approach in which the mass storage service is outside the data management system, and in which various software systems (including future data management systems) interface to this service through a standardized protocol. The IEEE is actively pursuing such standard protocols [6], and many vendors of commercially available storage systems have stated that they will help develop and support this standards effort for a variety of tertiary devices. Another advantage of our approach is that existing software applications, such as analysis and visualization software, can interface directly to the mass storage service. For efficiency reasons, many applications use specialized internal data formats and often prefer to interface to files directly rather than use a data management system.
Fig. 1: Current Situation. [Figure: a simulation/experimental system writes a multi-dimensional dataset from models or observations to an archival mass storage system, where it may be stored over multiple volumes (a multi-volume dataset is typically 1-20 GBytes stored on 5-100 tapes). Data storage problem: the data is not optimally stored for access patterns or storage device characteristics. Data access problem: access time is too long due to drive contention, seek delays, loading/unloading, and concurrent access locks when the desired subsets (pieces A, B, C) are retrieved from the original dataset by a visualization/analysis system.]
3. APPROACH
As mentioned above, the main problem we address is the slow access of small subsets from a large dataset in archival storage needed for visualization and analysis. As can be seen in Figure 1, this problem has a storage organization component and an access component. Naturally, the data access depends on the method used for the initial storage of the dataset. Because a dataset is typically stored on tertiary storage systems in the order it is produced and not in the order in which it will be retrieved, a large portion of the dataset needs to be read in order to extract the desired subset. This leads to long delays (30 minutes to several hours is common) depending on the size of the dataset, the speed of the device used, and the usage load on the mass storage system. We show schematically in Figure 1 that the desired subset (which consists of pieces A, B, C belonging to a single dataset) is scattered over multiple volumes of the Mass Storage System.

We address the above problem by developing algorithms and software that facilitate the partitioning of a large dataset into multiple "clusters" that reflect their expected access. For example, if many desired subsets consist of certain variables over a period of a year, then reorganizing and partitioning the data such that the corresponding variables are stored as "yearly clusters" in contiguous storage locations will facilitate efficiently reading the desired data. In general, the portions of a dataset that satisfy a query may be scattered over different parts of the dataset, or even on multiple volumes. For example, typical climate simulation programs generate multiple files, each covering a period of 5 days for all variables of the dataset. Thus, for a query that requests a single variable (say "precipitation") for a specific month at ground level, the relevant parts of the dataset reside on 6 files (each for a 5-day period). These files may be stored on multiple volumes. Further, only a subset of each file is needed, since we are only interested in a single variable and only at ground level. If we collected all the parts relevant to a query and put them into a single file, then we would have the ideal cluster for that query. Of course, the problem is one of striking a balance between the requirements of all queries, and designing clusters that will be as close as possible to the ideal cluster of each query.
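To see the scale of the inefficiency, here is a minimal back-of-envelope sketch of the precipitation example. Only the 5-day file organization and the one-month, single-variable, ground-level query come from the text; the grid shape, variable count, level count, and time resolution are illustrative assumptions.

```python
# Hypothetical sketch of the 5-day-file example above. The grid shape,
# variable count, and level count are illustrative assumptions; the paper
# only specifies 5-day files and a one-month, single-variable query.

VARS, LEVELS, LAT, LON = 10, 20, 64, 128
STEPS_PER_DAY, BYTES = 4, 4                  # 6-hourly steps, float32 values

def mb(nbytes: int) -> float:
    return nbytes / 2**20

# One 5-day file holds every variable at every level for every time step.
file_5day = VARS * LEVELS * LAT * LON * 5 * STEPS_PER_DAY * BYTES

# Original layout: a 30-day query touches 6 files, each read in full.
original_read = 6 * file_5day

# Ideal cluster: one variable, one level, 30 days of time steps.
cluster_read = 1 * 1 * LAT * LON * 30 * STEPS_PER_DAY * BYTES

print(f"original: {mb(original_read):7.1f} MB")          # 750.0 MB
print(f"cluster:  {mb(cluster_read):7.1f} MB")           #   3.8 MB
print(f"ratio:    {original_read / cluster_read:.0f}x")  # 200x
```

Under these assumed dimensions the ideal cluster reads roughly 200 times less data than the original file layout, consistent with the 1 to 2 orders of magnitude of improvement reported in the abstract.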
The term "partitioning algorithm" is used to indicate that, as a result of the algorithm, a dataset will be partitioned (or restructured) into many such clusters. The term "cluster" is used to convey the idea that all the data that satisfy a query should ideally reside in a single cluster. The goal is to minimize the amount of storage that has to be read when a subset of the data is needed.

The way that the above techniques interact with the existing software is shown schematically in Figure 2. The same basic system components shown in Figure 1 also exist in Figure 2, along with additional components. The component labeled "Data Allocation and Storage Management" is responsible for determining how to reorganize a dataset into multiple "clusters", and for writing the clusters into the mass storage system in the desired order. The parts of the dataset that go into a single cluster may originally be stored in a single file or in multiple files (as mentioned above, a typical climate modeling dataset is stored in multiple files, each containing 5 days' worth of data). The component labeled "Data Assembly and Access Management" is responsible for accessing the clusters, and for assembling the desired subset from clusters (rather than reading the entire dataset). One consequence of this component is that analysis and visualization programs are handed the desired subset, and no longer need to perform the extraction of the subset from the file. Note that the schematic illustration in the Archival Mass Storage System is intended to show that the desired cluster "ABC" may be stored in contiguous storage space for efficiency as a result of the allocation analysis. The details of the two components are shown in Figures 3 and 4.

On the left of Figure 3, the Data Allocation Analyzer is shown. It accepts specifications of access patterns for analysis and visualization programs, and parameters describing the archival storage device characteristics. This module selects an optimal solution and produces an Allocation Directory that describes how the multidimensional datasets should be partitioned and stored.
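The excerpt describes the Allocation Directory's role but not its concrete layout, so the following is a minimal sketch, assuming the directory records, for each cluster, the slice of the multidimensional dataset it holds and its physical placement. All field and method names are hypothetical.

```python
from dataclasses import dataclass

# Minimal sketch of an Allocation Directory: per-cluster dimension ranges
# plus physical placement. Every field name here is a hypothetical stand-in;
# the paper does not specify the directory's actual layout.

@dataclass
class ClusterEntry:
    cluster_id: int
    variable: str                   # e.g. "precipitation"
    level_range: tuple[int, int]    # inclusive vertical-level range
    time_range: tuple[int, int]     # inclusive time-step range
    volume: str                     # tape/cartridge volume label
    offset: int                     # starting position on the volume
    length: int                     # cluster size in bytes

@dataclass
class AllocationDirectory:
    dataset: str
    clusters: list[ClusterEntry]

    def lookup(self, variable: str,
               levels: tuple[int, int],
               times: tuple[int, int]) -> list[ClusterEntry]:
        """Return the clusters overlapping a multidimensional range query."""
        return [c for c in self.clusters
                if c.variable == variable
                and c.level_range[0] <= levels[1] and levels[0] <= c.level_range[1]
                and c.time_range[0] <= times[1] and times[0] <= c.time_range[1]]
```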
The Allocation Directory is used by the File Partitioning Module. This module accepts a multidimensional dataset and reorganizes it into "clusters" that may be stored in consecutive archival storage allocation spaces by the mass storage system. The resulting clusters are passed on to the Storage Manager Write Process. In order for the Storage Manager Write Process to have control over the physical placement of clusters on the mass storage system, enhancements to the protocol that defines the interface to the archival mass storage system were developed. Unlike most current implementations, which do not permit control over the direct physical placement of data on archival storage, the enhanced protocol permits forcing "clusters" to be placed adjacent to each other so that reading adjacent "clusters" can be handled more efficiently. Accordingly, the software implementing the mass storage system's bitfile server and storage servers needs to be enhanced as well. More details on the modified protocols are given in Section 7.
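The enhanced protocol itself is detailed in Section 7, which is not part of this excerpt, so the sketch below only illustrates the idea of a placement hint: the write process asks the storage server to place each cluster adjacent to the previous one. The interface is hypothetical and is not the actual UniTree or IEEE bitfile-server API.

```python
# Hypothetical sketch of a Storage Manager Write Process using a placement
# hint. This is NOT the actual UniTree or IEEE Mass Storage protocol; it
# only illustrates "force clusters adjacent" as described in the text.

class StorageServerStub:
    """Stand-in for an archival storage server with placement control."""
    def __init__(self) -> None:
        self.next_offset = 0

    def write(self, data: bytes, *, adjacent_to_previous: bool) -> int:
        # A real enhanced server would use the hint to allocate the bytes
        # immediately after the previous write on the same volume; this
        # stub is trivially contiguous, so the hint is purely illustrative.
        offset = self.next_offset
        self.next_offset += len(data)
        return offset

def write_clusters(server: StorageServerStub, clusters: list[bytes]):
    """Write clusters in allocation order, requesting physical adjacency."""
    placements = []
    for i, data in enumerate(clusters):
        off = server.write(data, adjacent_to_previous=(i > 0))
        placements.append((off, len(data)))   # recorded in the directory
    return placements

print(write_clusters(StorageServerStub(), [b"A" * 8, b"B" * 8, b"C" * 8]))
# [(0, 8), (8, 8), (16, 8)] -- cluster "ABC" laid out contiguously
```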
In Figure 4, we show the details of reading subsets from the mass storage system. Upon a request for a data subset, the Storage Manager Read Process uses the Allocation Directory to determine the "clusters" that need to be retrieved. Thus, reading of large files for each subset can be avoided. Here again, the bitfile server and storage server of the mass storage system need to be extended to support enhanced read protocols. Once the clusters are read from the mass storage system, they are passed on to the Subset Assembly Module. Ideally, the requested data subset resides in a single cluster (especially for queries that have been favored by the partitioning algorithm). But, in general, multiple clusters will have to be retrieved to satisfy a subset request, where only part of each cluster may be needed. Still, the total amount of data read will typically be much smaller than the entire dataset.
The Subset Assembly Module is responsible for accepting multiple clusters, selecting the appropriate parts from each, assembling the selected parts into a single multidimensional subset, and passing the result to the analysis and visualization programs.

Fig. 2: Areas of Improvements. [Figure: the setting of Figure 1 with two added components. On the storage side, a Data Allocation and Storage Management component (see Figure 3 for details) clusters data according to access patterns, so that frequently accessed subsets are partitioned into clusters and a desired cluster such as "ABC" is stored contiguously in the archival mass storage system. On the retrieval side, a Data Assembly and Access Management component (see Figure 4 for details) serves the desired subsets to the visualization/analysis system.]
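Continuing the hypothetical Allocation Directory sketch from earlier in this section, the read-and-assemble path just described might look as follows. Here `read_cluster` stands in for the enhanced read protocol, clusters are assumed to be arrays indexed by (level, time), and the directory is assumed to return overlapping clusters in time order; all names are illustrative.

```python
import numpy as np

# Hypothetical sketch of the read path: look up clusters in the Allocation
# Directory, fetch each one, trim it to the requested ranges, and assemble
# a single multidimensional subset. Names are illustrative, not the
# paper's actual API.

def read_subset(directory, read_cluster, variable, levels, times):
    """levels/times are inclusive (lo, hi) ranges over the whole dataset."""
    parts = []
    for entry in directory.lookup(variable, levels, times):
        arr = read_cluster(entry)        # array indexed by (level, time)
        # Map the global query ranges into this cluster's local coordinates.
        l0 = max(levels[0], entry.level_range[0]) - entry.level_range[0]
        l1 = min(levels[1], entry.level_range[1]) - entry.level_range[0]
        t0 = max(times[0], entry.time_range[0]) - entry.time_range[0]
        t1 = min(times[1], entry.time_range[1]) - entry.time_range[0]
        parts.append(arr[l0:l1 + 1, t0:t1 + 1])
    # Ideally one cluster covers the query (the favored case); otherwise
    # concatenate the trimmed pieces along the time axis.
    return parts[0] if len(parts) == 1 else np.concatenate(parts, axis=1)
```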
Next, we discuss some details of the partitioning and subset assembly processes, as well as the management of the allocation directory and associated metadata.
3.1. The Partitioning Process
The characterization of access patterns of the analysis and visualization programs is essential for organizing the data to achieve high access efficiency. Of course, there may be conflicting access patterns. Thus, an analysis of the access patterns is needed to determine the optimal partitioning and allocation of clusters on archival storage. The partitioning algorithms use a model of the access patterns as well as a model of the physical device characteristics. The specific techniques used for determining the optimal allocation are given in a later section.
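The concrete models are deferred to that later section (not part of this excerpt), so the following only sketches plausible inputs to a partitioning algorithm: a weighted set of query templates and a tape-device model. Every field here is an assumption.

```python
from dataclasses import dataclass

# Illustrative inputs to a partitioning algorithm: a model of access
# patterns (query templates with relative frequencies) and a model of
# device characteristics. All fields are assumptions; the paper's actual
# models appear in a later section not included in this excerpt.

@dataclass
class QueryTemplate:
    variables: list[str]     # variables requested together
    level_span: int          # number of vertical levels per request
    time_span_days: int      # temporal extent per request
    frequency: float         # relative share of the workload

@dataclass
class TapeDeviceModel:
    mount_s: float           # cartridge mount/load time
    avg_seek_s: float        # average positioning time
    transfer_mb_s: float     # sustained read rate
    capacity_mb: float       # e.g. 200-250 for 3480 cartridges

workload = [
    QueryTemplate(["precipitation"], level_span=1, time_span_days=30,
                  frequency=0.6),
    QueryTemplate(["temperature", "humidity"], 20, 365, 0.4),
]
device = TapeDeviceModel(mount_s=30, avg_seek_s=45,
                         transfer_mb_s=3, capacity_mb=200)
```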
In an environment of typical mass storage systems we find a multi-level hierarchy consisting of memory, magnetic disks, and robotic devices for tapes and optical disks. Each level in the hierarchy may serve as a cache for the next level. As a by-product of our partitioning algorithms,