Follow Me On Twitter Facebook LinkedIn Flickr
Surprisingly I'm rather liking the Amazon elastic compute cloud. Running my first VM instance with my new pet Linux distro ubuntu 10.04 ... 3 weeks ago
A software development and computer technology blog.

Archive for August, 2007

The Universe in code.

If you’re old, and lucky enough to have spent many an hour piloting your spacecraft from star system to star system, and spacestation to spacestation, buying and selling goods, taking on pirates and alien attacks in the game ‘Elite’, by David Braben and Ian Bell. Then you may well have been amazed by the sheer size of the game, hosting thousands of stars, planets and moons, with coordinates, descriptions, stats and stock market prices all tightly packed into 48KB of memory (in the case of the spectrum I owned, that is). Now it’s no mystery how they managed to do this, generating all their data procedurally through a fixed algorithm, so I decided to set about creating my own Universe-generating algorithm for my own projects.

Initially I found an interesting page which gives an example of creating star names from 32 ‘sounds’. So I started from this idea and expanded it to use a lot more ‘sounds’ and a list of optional name endings, which are in the vein of star, planet and moon names found in science fact and fiction, like -rsae, -peia, -ing, -land, etc. It started off very simple and I’d take a look at the resulting names and wonder how I could improve on them, until I arrived at something fairly useable. I’ll probably expand on it a little more and tweak it here and there.

At the time of writing this there are 100 ‘sounds’, which are:

const std::string nameParts[] = {
	"a"              , "e"              , "i"              , "o"              , "u"           ,
	"b[lr][eio]a"    , "b[lr][a]e"      , "b[lr][ae]i"     , "b[lr][aeio]o"   , "b[lr]u"      ,
	"z[eioy]a"       , "z[aeo]e"        , "z[ae]i"         , "z[aeo]o"        , "zu"          ,
	"c[hlr][eio]a"   , "c[hlr][aeo]e"   , "c[hlr][ae]i"    , "c[hlr][aeo]o"   , "c[hlr]u"     ,
	"y[e]a"          , "y[e]e"          , "y[eo]o"         , "d[eiory]a"      , "d[r]e"       ,
	"d[aer]i"        , "d[aeior]o"      , "d[r]u"          , "x[i]a"          , "xe"          ,
	"f[aeilor]a"     , "f[aelor]e"      , "f[aelr]i"       , "f[aeilor]o"     , "f[lr]u"      ,
	"w[aei]a"        , "w[aeo]e"        , "w[ae]i"         , "w[aeo]o"        , "wu"          ,
	"g[aeilory]a"    , "g[aelor]e"      , "g[aelr]i"       , "g[aeilor]o"     , "g[r]u"       ,
	"v[eio]a"        , "v[ael]e"        , "v[ae]i"         , "v[aeio]o"       , "vu"          ,
	"h[eioy]a"       , "h[ae]e"         , "h[ae]i"         , "h[aeo]o"        , "hu"          ,
	"t[h][r][aeio]a" , "t[h][r][ae]e"   , "t[h][r][ae]i"   , "t[h][r][aeo]o"  , "t[h][r]u"    ,
	"j[eio]a"        , "j[ae]e"         , "j[ae]i"         , "j[eo]o"         , "ju"          ,
	"s[hl][aeio]a"   , "s[hl][ae]e"     , "s[hl][ae]i"     , "s[hl][aeo]o"    , "s[hl]u"      ,
	"k[lr][aeio]a"   , "k[lr][ae]e"     , "k[lr][ae]i"     , "k[lr][aeo]o"    , "k[lr]u"      ,
	"r[aeioy]a"      , "r[ae]e"         , "r[ae]i"         , "r[aeoy]o"       , "r[y]u"       ,
	"l[l][eio]a"     , "l[l][ae]e"      , "l[ae]i"         , "l[aeio]o"       , "lu"          ,
	"p[hlr][ey]a"    , "p[hlr][e]e"     , "p[hlr]i"        , "p[hlr][aeo]o"   , "p[hlr]u"     ,
	"m[eioy]a"       , "m[ae]e"         , "m[ae]i"         , "m[aeo]o"        , "mu"          ,
	"n[aeio]a"       , "n[ae]e"         , "n[ae]i"         , "n[aeo]o"        , "nu"          };

The letters within square braces ‘[]‘ are optional characters, where there is a chance that an optional letter may appear and each optional character has an equal chance of being the one that does appear. Some ‘sounds’ have more than one optional character. For example “p[hlr][ey]a” may give “pa”, “pea”, “pya”, “pha”, “plya”, etc. Of course these ‘chances’ are also determined in particular order so as to ensure the names are consistent each time you enter the same seed into the algorithm. You might notice that each ‘sound’ ends with a vowel, this was just to ensure that it could create a good variety of words that (hopefully) weren’t too awkward to pronounce. I found it was quite easy to rectify many unwanted occurances of unpronounceable words by tweaking these strings.

The name-endings, of which there are currently 243, are:

const std::string nameEndings[] = {
	"c"    , "d"    , "g"    , "k"    , "l"    , "m"    , "n"    , "p"    , "r"    , "s"    ,
	"t"    , "x"    , "z"    , "boo"  , "bok"  , "bock" , "beia" , "beus" , "bham" , "bu"   ,
	"bi"   , "bia"  , "bae"  , "buth" , "boz"  , "biz"  , "bic"  , "bol"  , "buk"  , "cham" ,
	"ci"   , "cia"  , "ce"   , "cuth" , "car"  , "col"  , "di"   , "dia"  , "de"   , "duth" ,
	"dar"  , "dol"  , "duk"  , "du"   , "doo"  , "din"  , "dok"  , "dock" , "ding" , "deia" ,
	"dium" , "deus" , "dham" , "fi"   , "fia"  , "fah"  , "fun"  , "foo"  , "ff"   , "foon" ,
	"fael" , "frey" , "fham" , "gi"   , "gia"  , "ge"   , "gah"  , "gun"  , "gol"  , "guth" ,
	"gee"  , "goo"  , "goz"  , "gol"  , "geus" , "gham" , "ho"   , "ham"  , "han"  , "jar"  ,
	"jam"  , "jo"   , "jok"  , "kul"  , "kor"  , "king" , "kar"  , "kol"  , "kik"  , "kia"  ,
	"kol"  , "kham" , "land" , "lam"  , "lor"  , "lar"  , "ler"  , "lad"  , "lik"  , "lia"  ,
	"lus"  , "lim"  , "lphi" , "leus" , "lham" , "mar"  , "mand" , "mir"  , "mor"  , "moor" ,
	"mock" , "mak"  , "mik"  , "mitz" , "ming" , "mad"  , "mia"  , "mae"  , "mus"  , "meus" ,
	"nar"  , "nd"   , "nir"  , "nor"  , "noor" , "nock" , "nak"  , "nik"  , "nitz" , "ning" ,
	"nad"  , "nia"  , "nol"  , "nae"  , "nus"  , "neus" , "nham" , "par"  , "pia"  , "pock" ,
	"pitz" , "por"  , "pak"  , "ping" , "pad"  , "polus", "peia" , "peus" , "pae"  , "poe"  ,
	"pod"  , "pol"  , "phor" , "phus" , "phon" , "phoon", "pha"  , "phir" , "phog" , "phong",
	"phi"  , "pham" , "que"  , "rah"  , "ring" , "rhea" , "rom"  , "rok"  , "rock" , "rus"  ,
	"rag"  , "rol"  , "reus" , "rham" , "seus" , "sol"  , "som"  , "son"  , "sing" , "sand" ,
	"sik"  , "sur"  , "sog"  , "sae"  , "saeus", "sia"  , "sham" , "ti"   , "tia"  , "tae"  ,
	"teus" , "tus"  , "tos"  , "tir"  , "ton"  , "ting" , "tong" , "tom"  , "th"   , "thum" ,
	"tham" , "theus", "thom" , "thium", "thon" , "than" , "wath" , "wart" , "warg" , "wurt" ,
	"witz" , "wham" , "via"  , "vog"  , "voz"  , "vol"  , "vig"  , "vitz" , "xy"   , "y"    ,
	"bay"  , "day"  , "fay"  , "gay"  , "hay"  , "kay"  , "lay"  , "ly"   , "ley"  , "my"   ,
	"ny"   , "py"   , "ry"   , "dry"  , "ndry" , "gry"  , "ngry" , "sy"   , "ty"   , "sty"  ,
	"set"  , "rsae" , "rseus"};

In order to generate the names for a star map, I decided on using a coordinate system based on 3 sets of 3D Integer data (giving 9 digits in total) as the random seed. The first three digits are used as the X, Y and Z coordinates for what I call a ‘Quadrant’, giving a total of 1,000 quadrants in this Universe (Where X, Y and Z can each be 0 to 9), these being the largest unit of area in this particular system. Next is the ‘Sector’, again using 3 digits to represent the X, Y and Z coordinates within a quadrant. This gives 1,000 Sectors per quadrant. Finally, the last 3 digits represent a ‘SubSector’ within the specified Sector. Therefore we can procedurally generate names for 1,000 quadrants containing 1,000,000 sectors hosting 1,000,000,000 subsectors, which is a fair amount I’m sure you’ll agree.

For example, 439,016,224 would be the random seed for quadrant(4,3,9), sector(0,1,6), subsector(2,2,4). Details of the quadrant, including the name, would be determined by it’s 3 digits (439), the sector details would be dependant on the quadrant and sector coordinates (439016). This would give different results for a sector with the same number but in a different quadrant. And finally the subsector details would come from the whole number (439016224). This is reasonable when using 32-bit unsigned integers which handle numbers up to 4,294,967,295 as a single integer can store the whole random seed. Initially I was working on just producing the quadrant, sector and subsector names, where the subsector would also be the name of a star system if one was found to exist there. But this can also be used to procedurally create descriptions, statistics and other data for each item.

For good measure I decided to throw in a little extra option for subsector/star systems, whereby they had a chance of being denoted by a code. For this I chose it to be 2 letters, followed by a number between 132 and 9999, a hyphen ‘-’, a 2 digit number and then a single character either ‘A’, ‘B’, ‘C’ or ‘D’. This is just a preliminary code, until I have time to go into this further.

After this it struck me that although I was using the coordinates as the random seed within a C++ program, the same results would not necessarily appear in Java, Pascal, Php, etc. The only way to ensure that the coordinate numbers always gave the same results would be to write my own random number generator functions. This was in fact incredibly easier than I first thought, because Wikipedia actually had a set of pseudo code listed for the Mersenne Twister random number generator algorithm, and porting this to C++ was straightforward.

And this is currently as far as I have reached with this little sub-project. Although I intend to create a PHP version to place on a webpage whereby you can select a particular set of coordinates, that is a particular quadrant, sector and subsector/star system and receive names for them, or just click a button and receive a random, procedurally generated name.

Here’s a few examples of the output the current program gives for the first 16 quadrants, and the first sector and subsector/system name in each of those quadrants:

[Q000]:zamee,       [S000]:hexebay,   [s000]:klikelu
[Q001]:dokol,       [S000]:moa'gonir, [s000]:vocosaa
[Q002]:taejoneus,   [S000]:wedabok,   [s000]:ER3317-89D
[Q003]:benus,       [S000]:bujihe,    [s000]:tijuthom
[Q004]:xiaha,       [S000]:loamy,     [s000]:traopoe
[Q005]:peseus,      [S000]:claivoory, [s000]:mebajei
[Q006]:jubu,        [S000]:ritralam,  [s000]:julopak
[Q007]:charom,      [S000]:gagham,    [s000]:mizibu
[Q008]:owaakik,     [S000]:vaklu,     [s000]:AM8968-34A
[Q009]:wileateus,   [S000]:wer,       [s000]:mosufham
[Q010]:hia,         [S000]:xalo,      [s000]:waita
[Q011]:zaxavi,      [S000]:hetrulo,   [s000]:gileus
[Q012]:idi,         [S000]:trumo,     [s000]:duci
[Q013]:duulay,      [S000]:kunep,     [s000]:sevo
[Q014]:kacucla,     [S000]:sholuwart, [s000]:hajuhi
[Q015]:chaewaecuth, [S000]:nobiz,     [s000]:zema

I wonder how much alloys are going for in the Gileus system… You know the Gileus system, it’s in the Hetrulo sector on the edge of the Zaxavi quadrant…