A framework for interactive generation of music for games

Kristopher Reese, Roman Yampolskiy, Adel Elmaghraby
Computer Engineering and Computer Science
University of Louisville
Louisville, KY 40202
{ kwrees02, roman.yampolskiy, adel }

Abstract: Tonal music has had a rich history in video games and movies, and music generation has played a minor role in the history of game music as well. Recent developments in music theory have derived representations of chord progressions using geometric topologies. Unlike prior generative music, the framework proposed in this paper approaches tonal music generation by building networks of chords using these geometric topologies. This geometric network of chords can then be used inside reinforcement learning models to learn the best motions in the progression. The proposed method uses Q-Learning models, rewarding acceptable chords such as major and minor chords. Rewards are also given to chords within a specified scale. The proposed framework approaches tonal chord progressions by keeping a tonal center in the progression. Methods for creating interactive and unique music for video games are also discussed.

Index Terms: Q-Learning, Music, Generative, Interactive

I. INTRODUCTION

Music has long been a part of the video game industry, and much of that music has become a part of mainstream culture. Concerts dedicated to music from video games have become a common event in concert halls across the world. The production of music for video games, however, is usually left to composers. For the composers to reach the full gamut of emotions, many different pieces have to be composed, which can cost excessive amounts of money. For smaller gaming companies or independent game programmers, a music system capable of using the full spectrum of chords and emotions would greatly benefit game design.

Music has long been known to be able to affect one's emotional and physiological state.
Recent psychological studies have shown the effect of music on various affective states. Scherer and Zentner [1] lay out how various features of music affect states including emotion and mood. There is an argument as to whether music evokes a genuine emotion or whether the listener merely perceives emotions in compositions [2]. That debate, however, does not need to be settled in order to consider the effects that music can have in video games. The field of Music Therapy has a role in helping to ease various disorders. It has been applied to dementia care [3], schizophrenia [4], and autism [5]. These studies show that music has a therapeutic effect and would be well suited for implementation in therapeutic games.

There are of course other effects that music has on humans as well. We often use music, or the lack thereof, to warn us of dangers or as a sign of change for better or for worse. Recent studies have shown that the stress levels of video game play are affected by music in a physiological way as well [6]. These physiological effects are caused by the secretion of various hormones in response to signals from the auditory system.

It is both the physiological and the therapeutic effects of music that can be of interest to serious and therapeutic game developers. With this understanding, a more adaptive type of music may become of interest to these developers, so that moods and emotions can be affected without the need for large amounts of human-composed music.

This paper proposes a system that attempts to mimic a human composer's ability to write music, but does so on the fly as needed by the game. Recent developments in music theory have provided a means for using systems that implement stochastic decision making to generate chords.
It is this concept that this paper examines further.

This method differs from grammatical structures in that it allows the generated music to use the full spectrum of possible chords, from the traditional chords, to chromatic chords, to potentially non-harmonic chords. The method discussed here is an experiment into the feasibility of this approach, and the output is not expected to approach or exceed the output of grammatical methods.

The second and third sections of this paper discuss and develop the framework for generating music in the chord progression. Section II focuses on the geometrical topology of chord theory proposed by Tymoczko. It concentrates on motion in the model based on two-dimensional topologies, but a generalization to any $N$-dimensional space is discussed as well. Section III discusses the reinforcement learning model, called Q-Learning, initially used in this framework. It also discusses the extension of the Q-Learning model for generating chords and the reward system used for the framework. A method for resolving voice leading issues is also discussed. Section IV takes a look at the measure of tonality that is used for the generated music, and looks at a passage generated by the framework. Section V proposes ideas for uses in video games and other interactive methods for music generation. Section VI concludes the paper. Future work and developments are scattered throughout Sections IV-VI.

Fig. 1. The two-dimensional orbifold described by Dmitri Tymoczko [7]–[9], with the movements of figure 3 plotted on the space. Blue lines are the first to second interval; green is second to third; red is third to fourth; black is fourth to fifth.

II. GEOMETRIC MUSIC THEORY

Music theory is riddled with discussions about the language of music and what makes music sound good to listeners. Because of this, tonal music theory has become more of a grammatical language than a mathematical study. However, recent developments in music theory have begun to discuss the mathematics of music theory. Tymoczko determined that there is a latent model that can be used to represent tonal harmonic movement in any $n$-note chord using an $n$-dimensional orbifold topology [7]–[9]. This model plays a significant role in the development of the proposed framework, and is discussed in this section.

A. One- and Two-dimensional Spaces

Chords in music generally consist of three or more pitch classes. However, Tymoczko's model generalizes well to non-chord spaces in music as well. Tymoczko discusses a single pitch model, which maps to a one-dimensional topology that he calls a "circular pitch-class space" [7]. He says that, to think about these single pitch classes, we can represent each pitch class on a line [7], [9]. As on a keyboard, once the last pitch has been reached, the pitch classes repeat, so once we reach a note of G, we return to a pitch class of A.

We can discuss this more formally using a formal notation. $E \xrightarrow{-12} E$ represents a movement from the pitch class E to the same pitch class, and is topologically similar to a movement of $E \xrightarrow{+0} E$, though in practice the two carry different information. We can read the first notation as: "E moves down 12 semitones (or a musical octave)." This circle also captures information about the movement of the notes. A negative motion represents a movement of the note's frequency downward (making the note sound lower). A positive motion represents an increase in the note's frequency. Thus, although the two representations are topologically the same, we know that in the first notation the first E is higher than the second E.

Tymoczko puts more emphasis on the understanding of the two-dimensional musical space. This is likely because understanding it is much simpler than attempting to describe musical spaces in three or four dimensions. By understanding the two-dimensional space, one can begin to understand the higher-dimensional topologies as well.

Figure 1 shows a visual representation of the two-dimensional intervallic space. This model can be built from the combination of the circular one-dimensional spaces of each single note. The first note and the second note of the interval determine how movement can be plotted onto a topological mesh.

Tymoczko also describes motions of pitch classes in the model. Contrary movement in music is described as movement of the notes of the interval in different directions. In this model, it can be described by vertical movement from one interval to another, where movement upwards represents the notes moving towards one another, and movement downwards represents the notes moving away from one another. Parallel motion is described in music as both notes moving in the same direction, which can be represented as movement to the right or left in the mesh. Moving to the right results in parallel movement upwards and moving to the left results in parallel movement downwards [7], [9]. The possible movements in this two-dimensional space are shown in figure 2.

Fig. 2. Movements possible in the two-dimensional interval space.

A further understanding of what happens when we reach the end of the graph is also necessary. Tymoczko describes this two-dimensional space as repeating on the right and left in the same way a Möbius strip works. The right and left sides of this plot are brought back around and twisted so that the [F, F] pairs match up and the [C, C] pairs match up [7], [9].

Using the same formal language introduced for the one-dimensional space, we can understand movements in this space as well. The musical passage taken from [10] in figure 3 contains a two-voice intervallic passage.
We can represent the movement of the two voices on the graph. We notice that the movement from the first interval to the second is in parallel motion, but the voices do not move the same distance. Therefore we move in the parallel direction until we reach one of the voices' notes and then move the other voice's note to the proper note along the diagonal. The second to third interval is in contrary motion, so we move vertically and then fix the note movement. The next interval is parallel with some voice fixing, and lastly the fourth to fifth interval is a simple movement of the top voice. These movements along the diagram are shown in figure 1.

Fig. 3. A simple two-voice passage containing various intervallic jumps. The time signature used allows the music to fit into a single measure.

We can also formally define these movements as $(C,E) \xrightarrow{+2,+3} (D,G)$ for the first to second; $(D,G) \xrightarrow{-1,+2} (C,A)$ for the second to third; and so on. In this instance, we extend the formal language simply to include two notes and two values for change. These concepts are further generalized to any $n$-dimensional chord space by Tymoczko in [9].

B. N-dimensional Space

It is in the three- and four-dimensional topologies that we begin seeing what is traditionally understood in music to be a chord. These higher-dimensional spaces become harder to visualize; however, Tymoczko's model generalizes to a space of any dimension.

The three-dimensional space is shown in figure 4. This space is very similar to the two-dimensional space described in the previous section, except that it contains a third note in the model. Because of this, Tymoczko concludes that the shape of the model is that of a triangular prism [7]–[9]. This model contains two folds that are used to connect the edges of the prism together.
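The movement notation above, one signed semitone offset per voice with every pitch class wrapping modulo 12, extends naturally to any number of voices. The following is a minimal sketch of that reading, not the authors' implementation. The sharps-only note names, the function names `move` and `motion_type`, and the label used when a voice stays in place are assumptions made for illustration.

```python
# Pitch classes numbered 0-11 from C, sharps-only spelling (an assumption of this sketch).
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def move(chord, steps):
    """Apply one signed semitone offset per voice, wrapping each result modulo 12."""
    if len(chord) != len(steps):
        raise ValueError("need exactly one offset per voice")
    moved = []
    for name, step in zip(chord, steps):
        index = (PITCH_CLASSES.index(name) + step) % 12
        moved.append(PITCH_CLASSES[index])
    return tuple(moved)


def motion_type(steps):
    """Classify a two-voice movement by the signs of its two offsets."""
    first, second = steps
    if first * second > 0:
        return "parallel"   # both voices move in the same direction
    if first * second < 0:
        return "contrary"   # the voices move in different directions
    return "stationary voice"  # hypothetical label: at least one voice does not move


# The first movement of the two-voice passage: (C, E) moved by (+2, +3).
print(move(("C", "E"), (2, 3)))
# A one-voice octave descent lands on the same pitch class.
print(move(("E",), (-12,)))
# The same function handles a three-note chord.
print(move(("C", "E", "G"), (7, 7, 7)))
```

Because the offsets are kept separate from the resulting pitch classes, the direction information that the topology alone discards (an octave descent versus no motion) remains available to the caller.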
In figure 4, the $(C,C,C)$ triples match up, as do the $(E,E,E)$ triples and the $(G,G,G)$ triples [9].

Fig. 4. The three-dimensional orbifold described by Dmitri Tymoczko [7]–[9].

Reference [10] concludes that higher-dimensional chords exist in more modern music, especially in jazz, where chords of five or six notes are not uncommon. In these spaces, we would need to try to visualize five- or six-dimensional topologies. Though these are hard for us to visualize, they could be implemented in this framework. Because of the difficulty of explaining these spaces, they are not discussed in this paper. A further discussion of these spaces can be found in [9] and [10].

III. GENERATION USING Q-LEARNING

Using the models that were developed, a framework can begin to take shape that uses these principles to create tonal music. Instead of trying to create a grammatical structure that our music has to follow, a purely mathematical approach can be built using concepts from dynamic programming, reinforcement learning, and topological geometry. In this section we discuss and modify the Q-Learning model to generate chords in a progression.

A. Q-Learning

Q-Learning is a reinforcement learning technique developed by Watkins [11]. This method works by learning action-value functions that give the expected utility of a given action in a given state. Unlike a similar reinforcement learning method, the Markov Decision Process (MDP), this algorithm gives an approximation of the MDP, which speeds up running time. Unlike standard path-planning algorithms, it allows for random movement, with some probability that we will not reach the state that we desired.

The Q-Learning algorithm uses a Bellman update equation as part of the algorithm itself. This allows the algorithm to implicitly encode transitions and utilities in the Q matrix that is created. Later developments extended this model to include a learning rate, creating a delayed Q-Learning model.
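As a rough illustration of an action-value update with a learning rate, the sketch below trains a Q table on a small grid world by taking random actions from random start states until a terminal state is reached. Each visited entry is blended, at rate `alpha`, with the current state's reward plus the discounted best value of the next state. This is a sketch under stated assumptions rather than the paper's implementation: the positions and values of the terminal states, the deterministic moves, the values of `alpha` and `gamma`, and the choice to initialise terminal rows to their reward are all assumptions made here.

```python
import random

SIZE = 4
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
# Hypothetical layout: one positive and one negative terminal state.
TERMINALS = {(0, 3): 1.0, (1, 3): -1.0}


def reward(state):
    """Reward R(s): non-terminal states are worth 0 in this sketch."""
    return TERMINALS.get(state, 0.0)


def step(state, action):
    """Deterministic move (an assumption); walking into a wall leaves the state unchanged."""
    d_row, d_col = ACTIONS[action]
    row = min(max(state[0] + d_row, 0), SIZE - 1)
    col = min(max(state[1] + d_col, 0), SIZE - 1)
    return (row, col)


def train(episodes=2000, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    states = [(r, c) for r in range(SIZE) for c in range(SIZE)]
    # Terminal rows start at their reward so that it can propagate to neighbours.
    q = {s: {a: TERMINALS.get(s, 0.0) for a in ACTIONS} for s in states}
    for _ in range(episodes):
        state = rng.choice(states)          # random starting state
        while state not in TERMINALS:       # random actions until a terminal state
            action = rng.choice(list(ACTIONS))
            nxt = step(state, action)
            target = reward(state) + gamma * max(q[nxt].values())
            q[state][action] = (1 - alpha) * q[state][action] + alpha * target
            state = nxt
    return q


def greedy_action(q, state):
    """Policy extraction: pick the action with the highest estimated utility."""
    return max(q[state], key=q[state].get)


q_table = train()
print(greedy_action(q_table, (0, 2)))
```

After training, the greedy policy next to the positive terminal state steps into it, while the entry that leads into the negative terminal state carries a negative estimate and is avoided.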
The addition of a learning rate brings a Probably Approximately Correct (PAC) learning model to the Q-Learning model [12].

Sutton et al. simplified the mathematical form of Q-Learning with PAC to the form shown in equation 1. In this equation, $\alpha$ represents the learning rate for a state-action pair, $R(s)$ represents the reward of a given state, $\gamma$ represents the discount factor, and $Q(s,a)$ represents the current Q-matrix value of a state-action pair. $s'$ represents the state that the algorithm proposes to move to.

$$Q(s,a) \leftarrow Q(s,a)\,\bigl(1-\alpha(s,a)\bigr) + \alpha(s,a)\,\bigl[R(s) + \gamma \max_{a} Q(s',a)\bigr] \tag{1}$$

This learning algorithm takes in a state list, an action list for each state, and a rewards list for each state. We iterate the algorithm for a specified number of episodes to train it sufficiently. During each episode, we choose a state at random from the states list. Next we run some random action from the actions list for that state. With that state-action pair, we can evaluate equation 1. We continue these random actions until a terminal state is reached. At that point, we continue with another episode by choosing a new starting state at random. This algorithm always converges on an answer, like its counterpart the MDP [13].

After this learning phase has been completed, a Q matrix is returned that contains estimated utilities for each state-action pair. With these utilities, a traversal from any location can choose the actions which maximize the estimated utility for the state. The action with the highest utility value is the action we take.

If we run the algorithm on the 4x4 world shown in figure 5, we get the action policy that is shown in the figure as well. We can calculate the Q matrix over any number of episodes. The more episodes that are run, the more accurate the policy. In a small world, a small number of episodes tends to be enough for convergence.

Fig. 5. A 4x4 world with two terminal states, a positive and a negative terminal state. All other states have a reward of 0, although this does not need to be the case. The arrows represent the policy at any given state, where we try to minimize the chance of entering a negative terminal state and maximize the chance of reaching a positive terminal state.

B. Q-Learning for chord progressions

A previous work [10] discussed methods for creating actions in the topological mapping described by Tymoczko in [9]. This modification relies heavily on the modification of the transition matrix in a Markov Decision Process as defined by Bellman [14], [15] as well as in [16], [17].

In this model, the transition matrix is defined as shown on the left side of equation 2. This transition matrix is derived in such a way that partially observable worlds, known as Partially Observable Markov Decision Problems (POMDP), work as well. However, since we know the structure of the world in its entirety, the chord topology is a fully observable world, and the transition matrix therefore becomes the probability of an action, as shown in equation 2. These derivations are further explained in [10].

$$P(s' \mid s, a) = P(a) \tag{2}$$

Since our chords will be moving, we can define our action as a velocity vector, $v_x$, containing both directional and "speed" information. The speed is simply how far a state will travel in the world next. We can therefore define a chord treatment by replacing the action with the velocity vector, as shown in equation 3.

$$P(s' \mid s, v_x) = P(v_x) \tag{3}$$

Since each note can be considered independent, we can separate the vector into its independent components such that $v_x$ becomes $v_{ijk}$, as shown in equation 4.

$$P(s' \mid s, v_{ijk}) = P(v_i, v_j, v_k) \tag{4}$$

Using the chain rule on the right hand side, we can further simplify the equation as shown in equation 5.
Since movement is often dependent on where another note moved, the initial note, $i$, remains independent, the second note, $j$, is only dependent on $i$, and the third note, $k$, is dependent on both $i$ and $j$. We can remove the independent components from equation 5 and are left with equation 6.

$$P(v_i, v_j, v_k) = P(v_i \mid v_j, v_k)\,P(v_j \mid v_i, v_k)\,P(v_k \mid v_i, v_j) \tag{5}$$

$$P(v_i, v_j, v_k) = P(v_i)\,P(v_j \mid v_i)\,P(v_k \mid v_i, v_j) \tag{6}$$

As mentioned, each vector contains both directional and speed information about the action. We can extract each of these components such that the probability of each component vector $P(v_x)$ is equal to that shown in equation 7.

$$P(v_x) = P(d_x, sp_x) = P(d_x)\,P(sp_x) \tag{7}$$

Replacing each vector with its speed-direction pair, we are left with the vector probability shown in equation 8. Since we can assume that only the directions of the notes are dependent on the previous directions, and that all speeds are independent of both direction and other speeds, we are left with the vector probability shown in equation 9.

$$P(v_{ijk}) = P(d_i, sp_i)\,P(d_j, sp_j \mid d_i, sp_i)\,P(d_k, sp_k \mid d_i, sp_i, d_j, sp_j) \tag{8}$$

$$P(v_{ijk}) = P(d_k \mid d_i, d_j)\,P(d_j \mid d_i)\,P(d_i)\,P(sp_i)\,P(sp_j)\,P(sp_k) \tag{9}$$

Knowing this information, we can change the Q-Learning model, exchanging the actions for the velocity vector $v_{ijk}$. This results in the change to the Q matrix shown in equation 10. The Q matrix can remain a two-dimensional matrix by writing a function that uses the combination of velocity vectors to identify each one uniquely, and using that id in place of the velocity vector. This Q-Learning method captures the derivations of the transition matrix implicitly through the developed combination function.

$$Q(s, v_{ijk}) \leftarrow Q(s, v_{ijk})\,(1-\alpha) + \alpha\,\bigl[R(s) + \gamma \max_{v_{ijk}} Q(s', v_{ijk})\bigr] \tag{10}$$

C. Rewarding the System

The initial development of this algorithm used some basic assumptions, as well as trial and error, to develop a reward system. The numbers were chosen empirically, based on the qualitative sound of the result. To reward the system, we choose specific chords that follow tonal theory. All major chords are rewarded with 1500 points in the system, and all minor chords with 150. Augmented and diminished chords are extremely rare in tonal music and are only rewarded with a value of 8. Diatonic chords, those that are found in a specific musical scale, are rewarded with an extra 1000 points. Other chords that do not fall into these categories should be purely accidental, and in this initial development a negative reward was given.

Terminal states in the system follow tonal music theory's concept of cadences. These chords are diatonic and generally consist of a chord on the fifth note of the scale (V), a chord on the fourth note of the scale (IV), or a chord on the seventh note of the scale (diminished VII). These cadences are most likely to return to a chord built on the first pitch of the scale (I). These chords were taken from music theory as defined by Kostka et al. [18].

D. Resolving Voice Leading Issues

With the actions and tonality defined in this section, an issue arises that can make even human-written tonal music sound qualitatively "bad". Voice leading, the decision about how notes are arranged when moving from one note to another in a passage, becomes an issue that must be taken into account [18].

This does not imply that our assumptions about music are incorrect. In fact, humans have to consider proper voice leading when writing music as well. A greedy voice leading approach is discussed in [10].
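One way to picture a greedy voice-leading assignment is sketched below: each voice of the current chord, taken in order, is matched to the closest pitch class of the target chord that has not yet been used, with distance measured modulo 12 in both directions. This is only an illustrative reading and not the procedure of [10]. The sharps-only note names, the function names, and the tie-breaking rule (an upward move wins when both directions are six semitones) are assumptions of the sketch.

```python
# Pitch classes numbered 0-11 from C, sharps-only spelling (an assumption of this sketch).
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def signed_distance(src, dst):
    """Smallest signed semitone move from src to dst, measured modulo 12."""
    up = (dst - src) % 12      # distance when moving upward
    down = (src - dst) % 12    # distance when moving downward
    return up if up <= down else -down


def greedy_voice_leading(current, target):
    """Match each current voice, in order, to the nearest unused target pitch class."""
    if len(current) != len(target):
        raise ValueError("both chords need the same number of voices")
    remaining = [PITCH_CLASSES.index(name) for name in target]
    moves = []
    for name in current:
        src = PITCH_CLASSES.index(name)
        best = min(remaining, key=lambda dst: abs(signed_distance(src, dst)))
        moves.append((name, PITCH_CLASSES[best], signed_distance(src, best)))
        remaining.remove(best)  # a target pitch class is used only once
    return moves


# C major voiced lowest to highest as G, C, E, moving to G major.
for source, destination, semitones in greedy_voice_leading(["G", "C", "E"], ["G", "B", "D"]):
    print(source, "->", destination, semitones)
```

Because every voice commits to its nearest available note before later voices are considered, the result depends on the order in which the voices are visited, which is why a greedy assignment is not guaranteed to minimise the total movement.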
In this method, we take a single note from the chord which we are currently playing and calculate the modulo-12 distance in both directions ("up" and "down") to every note in the chord that we are moving to. This modular arithmetic is important so that values of 12 are represented as 0 in the table.

Therefore, if we want to move from a C major chord $[C,E,G]$ to a G major chord $[G,B,D]$, we take the voicing of the first chord, which we will assume is, from lowest to highest, $[G,C,E]$. We now calculate the motion to the new chord. The lowest note, $G$, moves up 0 pitch classes to $G$, 4 pitch classes to $B$, and 7 pitch classes to $D$. We continue this calculation for all voices in the chord, for both upward and downward motions. To use these values, we simply loop over each voice and choose the lowest movement from the distances to the next chord. Once we have used a movement, we remove the pitch class of the new chord from the list that we are searching [10].

Though in many cases this does not give the optimal solution, it is rare that composers present the optimal solution for voice leading either. The greedy approach is acceptable. A more optimal approach is also presented in [10]; however, as chords become larger, the time it takes to calculate the voice leading becomes a problem of tetration (or super-exponential growth).

IV. DISCUSSING TONALITY

Measuring the quality of music is a difficult task. Even if we were to measure how much someone liked the music, the definition of tonality does not always include music that people find aesthetically pleasing. Even amongst people, there are different tastes that one has to account for. For the music created here, the measure of tonality was not that the music turned out to be aesthetically pleasing, but that the music revolved around a single tone.

Figure 6 shows a short 16-measure composition created by the framework.
One can initially see that the music is very chromatic, as one would expect from other non-tonal generative methods. There are a lot of sharps and flats in the music, which might lead one to believe that the music has been generated by a computer, and in fact they would be right. The music starts on the C major chord [C, E, G] and moves about the world. At the end of the 16 measures, the music ends on a second C major chord.

However, a significant improvement over purely stochastic methods of composition is that the music continues to revolve around, and in fact remains on, the pitch class of C in the tenor voice (the second voice from the bottom). This means that there are other parameters that might need to be taken into consideration to improve the results of the framework, but that the experiment was a relative success. Remaining around the key of C, and in fact ending on the chord on which the music started, was the goal of this framework.

One thing that was not taken into consideration is the fact that chord progressions are not a single-time task. Chord progressions happen over a series of beats in the measure (or time increments). Therefore, an extension to the Q-Learning model may need to be examined to include this temporal dimension.

There are many algorithms which could be used to experiment further with this framework. The temporal difference learning algorithm [19] is a common method for solving the reinforcement learning problem with regard to time-delayed rewards. This could help create a system that takes any number of specified or learned steps to reach the goal.

Another approach that can be examined is learning the reward systems using Bayesian networks. Music from famous composers could be run through a system that learns the chord progressions and begins to reward certain chords more prominently based on what is learned. This would likely make the music much more pleasing, while still being relatively stochastic. The goal is not to imitate [...]

V. INTERACTIVE GENERATION FOR GAMES

One of the benefits of using this reward system approach to generative chord progressions is that any number of reward systems can be used to learn the rewards for specific chords. A video game could have auditory music feedback for how well a user is doing in the system. As the user begins [...]