A framework for system-level modeling and simulation of embedded systems architecture

Hindawi Publishing Corporation
EURASIP Journal on Embedded Systems
Volume 2007, Article ID 82123, 11 pages
doi:10.1155/2007/82123

Research Article

A Framework for System-Level Modeling and Simulation of Embedded Systems Architectures

Cagkan Erbas, Andy D. Pimentel, Mark Thompson, and Simon Polstra

Computer Systems Architecture Group, Informatics Institute, Faculty of Science, University of Amsterdam, Kruislaan 403, SJ Amsterdam, The Netherlands

Received 31 May 2006; Revised 7 December 2006; Accepted 18 June 2007

Recommended by Antonio Nunez

The high complexity of modern embedded systems impels designers of such systems to model and simulate system components and their interactions in the early design stages. It is therefore essential to develop good tools for exploring a wide range of design choices at these early stages, where the design space is very large. This paper provides an overview of our system-level modeling and simulation environment, Sesame, which aims at efficient design space exploration of embedded multimedia system architectures. Taking Sesame as a basis, we discuss many important key concepts in early systems evaluation, such as Y-chart-based systems modeling, design space pruning and exploration, trace-driven cosimulation, and model calibration.

Copyright © 2007 Cagkan Erbas et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

The ever-increasing complexity of modern embedded systems has led to the emergence of system-level design [1]. High-level modeling and simulation, which allows for capturing the behavior of system components and their interactions at a high level of abstraction, plays a key role in system-level design.
Because high-level models usually require less modeling effort and execute faster, they are especially well suited for the early design stages, where the design space is very large. Early exploration of the design space is critical, because early design choices have a decisive effect on the success of the final product.

The traditional practice for embedded systems performance evaluation often combines two types of simulators, one for simulating the programmable components running the software and one for the dedicated hardware part. For simulating the software part, instruction-level or cycle-accurate simulators are commonly used. The hardware parts are usually simulated using hardware RTL descriptions realized in VHDL or Verilog. However, using such a hardware/software cosimulation environment during the early design stages has major drawbacks: (i) building it requires too much effort, (ii) it is often too slow for exhaustive exploration, and (iii) it is inflexible in evaluating different hardware/software partitionings. Because an explicit distinction is made between hardware and software simulation, a completely new system model might be required for the assessment of each hardware/software partitioning. To overcome these shortcomings, a number of high-level modeling and simulation environments have been proposed [2–5]. These recent environments break away from low-level system specifications, and define separate high-level specifications for behavior (what the system should do) and architecture (how it does it).

This paper provides an overview of the high-level modeling and simulation methods as employed in embedded systems design, focusing on our Sesame framework in particular.
The Sesame environment primarily focuses on the multimedia application domain to efficiently prune and explore the design space of target platform architectures. Section 2 introduces the conceptual view of Sesame by discussing several design issues regarding the modeling and simulation techniques employed within the framework. Section 3 summarizes the design space pruning stage which is performed before cosimulation in Sesame. Section 4 discusses the cosimulation framework itself from a software design and implementation point of view. Section 5 addresses the calibration of system-level simulation models. In Section 6, we report experimental results achieved using the Sesame framework. Section 7 discusses related work. Finally, Section 8 concludes the paper.

Figure 1: (a) Mapping an application model onto an architecture model. An event-trace queue dispatches application events from a Kahn process towards the architecture model component onto which it is mapped. (b) Sesame's three-layered structure: application model layer, architecture model layer, and the mapping layer which is an interface between application and architecture models.

2. THE SESAME APPROACH

The Sesame modeling and simulation environment facilitates performance analysis of embedded media systems architectures according to the Y-chart design principle [6, 7]. This means that Sesame decouples application from architecture by recognizing two distinct models for them. According to the Y-chart approach, an application model (derived from a target application domain) describes the functional behavior of an application in an architecture-independent manner.
The application model is often used to study a target application and obtain rough estimations of its performance needs, for example, to identify computationally expensive tasks. This model correctly expresses the functional behavior, but is free from architectural issues, such as timing characteristics, resource utilization, or bandwidth constraints. Next, a platform architecture model (defined with the application domain in mind) defines architecture resources and captures their performance constraints. Finally, an explicit mapping step maps an application model onto an architecture model for cosimulation, after which the system performance can be evaluated quantitatively. This is depicted in Figure 1(a). The performance results may inspire the system designer to improve the architecture, modify the application, or change the projected mapping. Hence, the Y-chart modeling methodology relies on independent application and architecture models in order to promote their reuse to the greatest conceivable extent.

For application modeling, Sesame uses the Kahn process network (KPN) [8] model of computation in which parallel processes, implemented in a high-level language, communicate with each other via unbounded FIFO channels. Hence, the KPN model unveils the inherent task-level parallelism available in the application and makes the communication explicit. Furthermore, the code of each Kahn process is instrumented with annotations describing the application's computational actions, which makes it possible to capture the computational behavior of an application. The reading from and writing to FIFO channels represent the communication behavior of a process within the application model. When the Kahn model is executed, each process records its computational and communication actions, and thus generates a trace of application events. These application events represent the application tasks to be performed and are necessary for driving an architecture model.
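As an illustration of how an instrumented Kahn process network can produce per-process event traces, the following is a minimal sketch and not Sesame's actual implementation. It assumes Python threads as processes and `queue.Queue` objects as unbounded FIFO channels; the names `KahnProcess`, `run_example`, the channel identifiers `c0` and `c1`, and the doubling "transform" that stands in for a real DCT are all hypothetical.

```python
import queue
import threading


class KahnProcess(threading.Thread):
    """One Kahn process: blocking reads, non-blocking writes, own event trace."""

    def __init__(self, name, body):
        super().__init__(name=name)
        self.body = body      # function implementing the process behavior
        self.trace = []       # application events recorded by this process

    def read(self, channel_id, channel):
        token = channel.get()                      # blocks until a token arrives
        self.trace.append(("read", channel_id))
        return token

    def write(self, channel_id, channel, token):
        channel.put(token)                         # unbounded queue: never blocks
        self.trace.append(("write", channel_id))

    def execute(self, operation):
        self.trace.append(("execute", operation))  # annotation only, no computation

    def run(self):
        self.body(self)


def run_example(num_blocks=2):
    """Run a three-process pipeline A -> B -> C and return its traces and output."""
    c0, c1 = queue.Queue(), queue.Queue()          # unbounded FIFO channels
    results = []

    def source(p):
        for i in range(num_blocks):
            p.write("c0", c0, [i, i + 1])

    def transform(p):
        for _ in range(num_blocks):
            block = p.read("c0", c0)
            p.execute("DCT")
            p.write("c1", c1, [2 * v for v in block])  # placeholder, not a real DCT

    def sink(p):
        for _ in range(num_blocks):
            results.append(p.read("c1", c1))

    processes = [KahnProcess("A", source),
                 KahnProcess("B", transform),
                 KahnProcess("C", sink)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    return {p.name: p.trace for p in processes}, results
```

Because each process appends only to its own trace and reads block until data is available, the trace of every process is independent of how the threads happen to be scheduled, which is the property that makes such traces usable as input to a separate architecture simulation.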
Application events are generally coarse grained, such as read(channel_id, pixel_block) or execute(DCT).

Parallelizing applications. The KPN applications of Sesame are obtained by automatically converting a sequential specification (C/C++) using the KPNgen tool [9]. This conversion is fast and correct by construction. As input, KPNgen accepts sequential applications specified as static affine nested loop programs. As a first step, it applies a number of source-level transformations to these programs to adjust the amount of parallelism in the final KPN. The C/C++ code is then transformed into single assignment code (SAC), which resembles the dependence graph (DG) of the original nested loop program. Hereafter, the SAC is converted to a polyhedral reduced dependency graph (PRDG) data structure, which is a compact representation of a DG in terms of polyhedra. In the final step, the PRDG is converted into a KPN by associating a KPN process with each node in the PRDG. The parallel Kahn processes communicate with each other according to the data dependencies given in the DG. Further information on KPN generation can be found in [9, 10].

An architecture model simulates the performance consequences of the computation and communication events generated by an application model. It solely accounts for architectural (performance) constraints and does not need to model functional behavior. This is possible because the functional behavior is already captured by the application model, which drives the architecture simulation. The timing consequences of application events are simulated by parameterizing each architecture model component with a table of operation latencies. The table entries could include, for example, the latency of an execute(DCT) event, or the latency of a memory access in the case of a memory component.
This trace-driven cosimulation of application and architecture models makes it possible, for example, to quickly evaluate different hardware/software partitionings by just altering the latency parameters of architecture model components: a low latency models a hardware implementation (computation) or an on-chip memory access (communication), while a high latency models a software implementation or an off-chip memory access. With respect to communication, issues such as synchronization and contention on the shared resources are also captured in the architectural modeling.

To realize trace-driven cosimulation of application and architecture models, Sesame has an intermediate mapping layer. This layer consists of virtual processor components, which are the representation of application processes at the architecture level, and FIFO buffers for communication between the virtual processors. As shown in Figure 1(b), there is a one-to-one relationship between the Kahn processes and channels in the application model and the virtual processors and buffers in the mapping layer. The only difference is that the buffers in the mapping layer are limited in size, and their size depends on the modeled architecture. The mapping layer, in fact, has three functions [2]. First, it controls the mapping of Kahn processes (i.e., their event traces) onto architecture model components by dispatching application events to the correct architecture model component. Second, it makes sure that no communication deadlocks occur when multiple Kahn processes are mapped onto a single architecture model component. In this case, the dispatch mechanism also provides various strategies for application event scheduling. Finally, the mapping layer is capable of dynamically transforming application events into lower-level architecture events in order to realize flexible refinement of architecture models [2, 11].
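The role of the latency tables and of event dispatching can be illustrated with a deliberately simplified sketch. It is not Sesame's simulator: it only accumulates the latency of every event on the component its process is mapped onto, and it ignores synchronization, contention, bounded buffers, and event scheduling. The function name `simulate`, the component names `X` and `Y`, the operation names, and all latency values (in arbitrary cycles) are assumptions made for illustration.

```python
def simulate(traces, mapping, latencies):
    """Accumulate busy time per architecture component.

    traces:    process name -> list of (kind, argument) application events
    mapping:   process name -> architecture component name
    latencies: component name -> {operation name: latency}
    """
    busy = {component: 0 for component in latencies}
    for process, events in traces.items():
        component = mapping[process]          # dispatch to the mapped component
        table = latencies[component]
        for kind, argument in events:
            # execute events are looked up by operation, read/write by kind
            operation = argument if kind == "execute" else kind
            busy[component] += table[operation]
    return busy


# Hypothetical traces of two processes, each handling two blocks.
traces = {
    "A": [("execute", "DCT"), ("write", "c0")] * 2,
    "B": [("read", "c0"), ("execute", "VLE")] * 2,
}
mapping = {"A": "X", "B": "Y"}

# High DCT latency: DCT modeled as a software implementation on X.
software = {"X": {"DCT": 100, "write": 5}, "Y": {"read": 5, "VLE": 40}}
# Low DCT latency: the same component now models a hardware DCT.
hardware = {"X": {"DCT": 10, "write": 5}, "Y": {"read": 5, "VLE": 40}}

busy_software = simulate(traces, mapping, software)
busy_hardware = simulate(traces, mapping, hardware)
```

Note that the traces and the mapping are identical in both runs; only one table entry changes, which is how a different hardware/software partitioning can be evaluated without touching the application model.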
The output of system simulations in Sesame provides the designer with performance estimates of the system(s) under study together with statistical information such as utilization of architecture model components (idle/busy times), the degree of contention in a system, profiling information (time spent in different executions), critical path analysis, and average bandwidth between architecture components. These high-level simulations allow for early evaluation of different design choices. Moreover, they can also be useful for identifying trends in the systems' behavior, and help reveal design flaws/bottlenecks early in the design cycle.

Despite being an effective and efficient performance evaluation technique, high-level simulation would still fail to explore large parts of the design space. This is because each system simulation only evaluates a single design point in the maximal design space of the early design stages. Thus, it is extremely important that some direction is provided to the designer as guidance toward promising system architectures. Analytical methods may be of great help here, as they can be utilized to identify a small set of promising candidates. The designer can then focus only on this small set, for which simulation models can be constructed at multiple levels of abstraction. The process of trimming down an exponential design space to some finite set is called design space pruning. In the next section, we briefly discuss how Sesame prunes the design space by making use of analytical modeling and multiobjective evolutionary algorithms [12].

3. DESIGN SPACE PRUNING

As already mentioned in the previous section, Sesame supports separate application and architecture models within its exploration framework. This separation implies an explicit mapping step for cosimulation of the two models.
Since the number of possible mappings grows exponentially, a designer usually needs a subset of best candidate mappings for further evaluation in terms of cosimulation. In summary, the mapping problem in Sesame is that of optimally mapping an application model onto a (platform) architecture model. The problem formulation in Sesame takes three objectives into account [12]: the maximum processing time in the system, the total power consumption of the system, and the cost of the architecture. This section aims at giving an overview of the formulation of the mapping problem which allows us to quickly search for promising candidate system architectures with respect to the above three objectives.

Application modeling

The application models in Sesame are process networks which can be represented by a graph $AP = (V_K, E_K)$, where the sets $V_K$ and $E_K$ refer to the nodes (i.e., processes) and the directed channels between these nodes, respectively. For each node in the application model, a computation requirement (the workload imposed by the node onto a particular component in the architecture model) and an allele set (the processors that it can be mapped onto) are defined. For each channel in the application model, a communication requirement is defined only if that channel is mapped onto an external memory element. Hence, we neglect internal communications (within the same processor) and only consider external (interprocessor) communications.

Architecture modeling

The architecture models in Sesame can also be represented by a graph $AR = (V_A, E_A)$, where the sets $V_A$ and $E_A$ denote the architecture components and the connections between them, respectively. For each processor in an architecture model, we define the parameters processing capacity, power consumption during execution, and a fixed cost.

Having defined more abstract mathematical models for Sesame's application and architecture model components, we have the following optimization problem.
Definition 1 (MMPN problem [12, 13]). The multiprocessor mappings of process networks (MMPN) problem is

$$\min f(x) = \bigl(f_1(x), f_2(x), f_3(x)\bigr) \quad \text{subject to} \quad g_i(x),\ i \in \{1, \ldots, n\},\ x \in X_f, \tag{1}$$

where $f_1$ is the maximum processing time, $f_2$ is the total power consumption, and $f_3$ is the total cost of the system.

The functions $g_i$ are the constraints, and $x \in X_f$ are the decision variables. These variables represent decisions like which processes are mapped onto which processors, or which processors are used in a particular architecture instance. The constraints of the problem make sure that the decision variables are valid, that is, $X_f$ is the feasible set. For example, all processes need to be mapped onto a processor from their allele sets; or if two communicating processes are mapped onto the same processor, the channel(s) between them must also be mapped onto the same processor, and so on. The optimization goal is to identify a set of solutions which are superior to all other solutions when all three objective functions are minimized.

Here, we have provided an overview of the MMPN problem. The exact mathematical modeling and formulation can be found in [12].

3.1. Multiobjective optimization

To solve the above multiobjective integer optimization problem, we use the (improved) strength Pareto evolutionary algorithm (SPEA2) [14] that finds a set of approximated Pareto-optimal mapping solutions, that is, solutions that are not dominated in terms of quality (performance, power, and cost) by any other solution in the feasible set. To this end, SPEA2 maintains an external set to preserve the nondominated solutions encountered so far besides the original population. Each mapping solution is represented by an individual encoding, that is, a chromosome in which the genes encode the values of parameters. SPEA2 uses the concept of dominance to assign fitness values to individuals.
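The dominance relation over the three minimized objectives can be made concrete with a short sketch. It follows the strength and raw-fitness definitions of the published SPEA2 algorithm (strength is the number of individuals a solution dominates; raw fitness is the sum of the strengths of its dominators, so 0 means nondominated), but it is not the authors' implementation: density estimation, archive truncation, and the repair mechanism are omitted, and the function names and the objective vectors (time, power, cost) are invented for illustration.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def nondominated(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def raw_fitness(points):
    """SPEA2-style raw fitness: sum of the strengths of each point's dominators."""
    # Strength of a point: how many other points it dominates.
    strength = [sum(dominates(p, q) for q in points) for p in points]
    return [
        sum(strength[j] for j, q in enumerate(points) if dominates(q, p))
        for p in points
    ]


# Hypothetical candidate mappings as (processing time, power, cost) vectors.
candidates = [(10, 5, 3), (8, 6, 3), (12, 6, 4), (8, 6, 2)]
front = nondominated(candidates)
fitness = raw_fitness(candidates)
```

In this example, `(8, 6, 2)` dominates `(8, 6, 3)` because it is equal in time and power and strictly cheaper, whereas `(10, 5, 3)` and `(8, 6, 2)` are mutually nondominated: each is better in at least one objective.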
The fitness assignment takes into account how many individuals a solution dominates and is dominated by. Distinct fitness assignment schemes are defined for the population and the external set to always ensure that better fitness values are assigned to individuals in the external set. Additionally, SPEA2 performs clustering to limit the number of individuals in the external set (without losing the boundary solutions) while also maintaining diversity among them. For selection, it uses binary tournament with replacement. Finally, only the external nondominated set takes part in selection. In our SPEA2 implementation, we have also introduced a repair mechanism [12] to handle infeasible solutions. The repair takes place before the individuals enter evaluation to make sure that only valid individuals are evaluated.

In [12], we have shown that an SPEA2 implementation to heuristically solve the multiobjective optimization problem can provide the designer with good insight into the quality of candidate system architectures. This knowledge can subsequently be used to select an initial (platform) architecture to start the system-level simulation phase, or to guide a designer in finding, for example, alternative architectures when system-level simulation indicates that the architecture under investigation does not fulfill the requirements. Next, we continue discussing implementation details regarding Sesame's system-level simulation framework.

Figure 2: Sesame software overview. Sesame's model description language YML is used to describe the application model, the architecture model, and the mapping which relates the two models for cosimulation.

4. THE COSIMULATION ENVIRONMENT

All three layers in Sesame (see Figure 1(b)) are composed of components which should be instantiated and connected using some form of object creation and initialization mechanism. An overview of the Sesame software framework is given in Figure 2, where we use YML (Y-chart modeling language) to describe the application model, the architecture model, and the mapping which relates the two models for cosimulation. YML, which is an XML-based language, describes simulation models as directed graphs. The core elements of YML are network, node, port, link, and property. YML files containing only these elements are called flat YML. There are two additional elements, set and script, which were added to equip YML with scripting support to simplify the description of complicated models, for example, a complex interconnect with a large number of nodes. We now briefly describe these YML elements.

(i) network: network elements contain graphs of nodes and links, and may also contain subnetworks which create hierarchy in the model description. A network element requires a name and optionally a class attribute. Names must be unique in a network because they are used as identifiers.

(ii) node: node elements represent building blocks (or components) of a simulation model. Kahn processes in an application model or components in an architecture model are represented by nodes in their respective YML description files. Node elements also require a name and usually a class attribute which are used by the simulators to identify the node type. For example, in Figure 3(a), the class attribute of node A specifies that it is a C++ (application) process.

(iii) port: port elements add connection points to nodes and networks. They require name and dir attributes. The dir attribute defines the direction of the port and may have values in or out. Port names must also be unique in a node or network.
    <network name="ProcessNetwork" class="KPN">
      <property name="library" value="libPN.so"/>
      <node name="A" class="CPP_Process">
        <port name="port0" dir="in"/>
        <port name="port1" dir="out"/>
      </node>
      <node name="B" class="CPP_Process">
        <port name="port0" dir="in"/>
        <port name="port1" dir="out"/>
      </node>
      <node name="C" class="CPP_Process">
        <port name="port0" dir="in"/>
        <port name="port1" dir="out"/>
      </node>
      <link innode="B" inport="port1" outnode="A" outport="port0"/>
      <link innode="A" inport="port1" outnode="C" outport="port0"/>
      <link innode="C" inport="port1" outnode="B" outport="port0"/>
    </network>

(a) YML description of process network in Figure 1

    <set init="$i = 0" cond="$i &lt; 10" loop="$i++">
      <script> $nodename="processor$i" </script>
      <node name="$nodename" class="pearl_object">
        <port name="port0" dir="in"/>
        <port name="port1" dir="out"/>
      </node>
    </set>

(b) An example illustrating the usage of set and script elements

    <mapping side="source" name="application">
      <mapping side="dest" name="architecture">
        <map source="A" dest="X">
          <port source="portA" dest="portBus"/>
        </map>
        <map source="B" dest="Y">
          <port source="portB" dest="portBus"/>
        </map>
        <instruction source="op_A" dest="op_A"/>
        <instruction source="op_B" dest="op_B"/>
      </mapping>
    </mapping>

(c) The YML for the mapping in Figure 2

Figure 3: Structure and mapping descriptions via YML files.

(iv) link: link elements connect ports. They require innode, inport, outnode, and outport attributes. The innode and outnode attributes denote the names of nodes (or subnetworks) to be connected. Ports used for the connection are specified by inport and outport.

(v) property: property elements provide additional information for YML objects. Certain simulators may require certain information on parameter values. For example, Sesame's architecture simulator needs to read an array of execution latencies for each processor component in order