This essay is part of a series with 'Worlds andFlats' and 'The spread of knowledge in a large gameworld'; if you haven't read those you may want to
read them before reading this. This essay describes how a large
world can come into being and can evolve. I've written again on this subject since - see 'Populating a game world')
In practice we can't go that big for all sorts of space and time reasons. Recording land heights is inevitably an issue. I don't know of a pseudo random function which will generate satisfying land heights. Anything based on Julia sets, for example, ends up with landforms symmetrical around a central point. Furthermore, the shapes of fractals which can be constructed from simple functions tend to have a noticable and unnatural degree of self-similarity across scales. I'd dearly like to be wrong on this, but I think we need to store at minimum elevation values at ten metre intervals. If we can accept 100mm resolution for elevations, storing 16 bit values gives a range of 6,500 metres - 21,000 feet - from the deepest seabed to the peaks of the highest mountains.
This means that landform information alone requires 20Kbytes per square kilometre - unindexed, but seeing it's a rigid ten metre grid that isn't a problem. Which, in turn, means that you can store landform information for a planet the size of Earth in one terrabyte. But we don't need a planet the size of earth. Scotland is 80,000 square kilometers of land area; allowing for a bit of sea around to the horizon around it, say 100,000 square kilometers. That seems to me more than big enough to be a game space. It amounts to 160Mb of landform data, which is completely manageable.
If we stored plant data for every distinctive plant in Scotland - even at one per ten square metres - that does become an impractically large data set, because quite apart from anything else, the plant locations do have to be indexed. But actually, given that the actual plants that grow are a function of the location at which they grow, no player is going to notice if the pattern of the locations of plants is the same for each square kilometre. So we can manage plant data for a land area the size of Scotland in 400,000 bytes - we could do it in less (if the locations were generated using a pseudo-random function, much less).
Building data is different. We need to store the latitude, longitude and type of every building explicitly, and again they need to be indexed in order that we can recover the buildings in a given area efficiently. We need about 16 bytes per building (four bytes latitude, four longitude, two type; then for each tile a null-terminated vector of pointers to building records). If we assume that our feudal land of 80,000 square kilometers has a population of a million, and that there are on average five occupants of every building, that's two hundred thousand buildings, implying 3.2Mb of data.
Of course, that's just the backing store size. As tiles are loaded into active memory - see the essay 'Tiles and Flats' this raw backing data has to be inflated procedurally into actual models that can be rendered; models which may have thousands of vertices and hundreds of kilobytes of textures. The functions which do that inflating have some finite size, and, significantly, they'll need to work on prototype models which will in turn take storage space. Finally there are hand-edited models potentially used at particular plot locations; those need to be stored more or less in full. But all this has not become an unmanageable amount of data. It seems to me plausible that you could store a fully populated 100,000 square kilometer game world on one uncompressed 700Mb CD. On a 4Gb DVD, you could do it very easily.
Microworld
Some twenty years ago I wrote a rather sophisticated cellular automaton which I called 'Microworld' which modelled the spread of human population over a landscape. It did this by first fractally folding a grid to assign elevations to cells. Then, cells below a critical elevation the tree line were assigned as forest. For each cycle 'year' a cell remained forest, its soil fertility would increase. Random events 'lightning strikes' could change a cell from forest to clearing. Then the following transitions might take place, each with a probability, where each cell is considered to have eight neighbours:- A forest cell with a lightning strike as a neighbour may catch fire and burn
- A forest cell with a fire as a neighbour may catch fire and burn
- A burning cell become a clearing cell
- A clearing cell with forest or scrub as a neighbour may become scrub
- A scrub cell may become forest
- A clearing cell (including cells above the treeline) may become a pastoral cell (pastoral 1, no settlement)
- A pastoral 1 cell whose soil fertility falls below a threshhold becomes waste
- A pastoral 1 cell with no pastoral neighbours may become waste
- A waste cell below the treeline may become scrub
- A waste cell may become clearing
- A pastoral 1 cell with two or more pastoral neighbours may become a pastoral 2 cell (settlement)
- A forest cell with two or more pastoral neighbours may become clearing
- A pastoral 2 cell with two or more pastoral 2 neighbours may become agrarian 1
- An agrarian 1 cell which falls below a critical fertility becomes pastoral 1
- An agrarian 1 cell with three or more agrarian 1 neighbours becomes agrarian 2 (smith, mill)
- A cell with three or more agrarian 2 neighbours becomes market
- A market cell with no agrarian 2, market or urban neighbours becomes waste
- A cell with two or more market neighbours becomes urban
Microworld Two
The objective of this essay is to outline an angorithm for creating inhabited landscapes in which games can be set, which are satisfyingly believable when rendered in three dimensions. The objective of creating landscapes 'procedurally' that is, with algorithms is that they can be very much larger than designed landscapes for the same richness of local detail. This does not mean that every aspect of the final landscape must be 'procedural'. It would be possible to use the techniques outlined here to create landscapes which were different every time the game was played, but it would be equally possible to create a landscape which was frozen at a particular point and then hand edited to add features useful to the game's plot. And while I'm principally thinking in this about role playing games, this sort of landscape would be applicable to many other sorts of games strategy games, god games, first person shooters...The physical geography
Consider our landscape as, once again, a fractally folded sheet on which any given point has characteristics based on its elevation and orientation. There are two critical levels water level and treeline. The water level is, overall, sea level, but in the case of a localised depression it is equal to the lowest land height between the depression and the sea (lakes form in depressions). Computing the fractal sheet forms stage one in computing the landscape. Next, we need functions which, for any given point on the landscape, compute two different dimensions of soil fertility: water and warmth. We'll assume a coriolis prevailing wind blowing from the west, bringing in damp air from an ocean in that direction. Western slopes are wetter than eastern slopes. In principle, also, there's likely to be a rain shadow to the east of high ground leading to considerable aridity, but that may be too expensive to compute. Rain runs swiftly off steeper slopes, more slowly on flatter ground, so flatter ground is wetter than steeper ground. Water flows down hill, so lower ground is on the whole wetter than higher ground. This isn't a precise model of soil hydrology, but I think it's good enough. From each lake a watercourse follows the lowest possible path to the sea. Watercourses modify the land overwhich they flow, carving out a route at least sufficient to carry the amount of water collected in the watershed above each point. Where watercourses flow down steeper gradients, they carve out gullies, possibly with waterfalls. Where they cross shallower gradients or level ground, they become broader. Computing the watercourses becomes the second stage of computing the lanscape.Vegetation
Now sprinkle seeds randomly across the landscape at a density of roughly one every ten square metres. Seeds which fall in water, ignore (? or make into water plants?). The position of the plant is taken from the random sprinkling. The species and size of the plant that grows from the plant are a function of the water and warmth functions described above, with latitude and longitude as seeds for pseudo-random functions delivering aspects like branching and so on enough to make individual plants distinct and not carbon copies even of other plants of the same species, but nevertheless recreatable from just the latitude and longitude. So for each plant only two integers need to be stored, yet every time a player passes he will see an identically recreated world. Of course there is a trade-off between storage space and rendering time, and it may be more efficient to build and cache a detailed model of each plant. Like a lot of other things it depends on the game being designed and the processing power of the platform on which that game is delivered. As to how the functions which select the vegetation type work, obviously trees grow better in wetter places, grassland plants in dryer places; within the wetter places, coniferous trees are more prevalent where it is cooler, broadleaves where it is warmer. In the very wettest places, willows, alders and marshland plants. These plants the seeded plants are the feature plants of the landscape. When rendering the landscape the renderer would first apply a suitable local surface texture, for example, in grassland areas, grass.Settling the world
So now we need to make this an inhabited landscape. My proposal for this is to introduce proto-actors, which may be the same software agents as the non-player characters the user will interact with (see my essay on the spread of knowledge). At this stage in their lifecycle, the proto-actors are fairly simple state transition machines. Generally, their algorithm is as follows: Starting from one or two seed points, proto-agents will initially move across the landscape travelling at most 20Km in a day, preferring to stop at inns or else at existing settlements; and will maintain a history of the places they have been, never revisiting a place until they have settled. Whenever moving, whether before they have settled or after, proto-actors will plan their route across the landscape, avoiding trees, buildings, and steep gradients, and will prefer to cross rivers at a bridge (if available) or else a ferry (if available), or failing that at the narrowest convenient point. When proto-actors settle, they will claim an area of territory appropriate to their trade more below; the system must build up a database of land holdings. In particular a land holding will never cross a watercourse, an existing road or overlap another land holding (although roads may develop across existing holdings). This is key because I don't want holdings normally to have regular shapes. A settled proto-agent will build a building to live in, and possibly an additional one for his trade. When building buildings, proto-actors will prefer to build at the edge of their land holding, as close as possible to existing buildings and ideally at the side of an existing road. The richer an existing building is, the more attractive it will be to new buildings. Buildings will be built with their long edge aligned with the edge of the owner's hoding.- A proto-actor is initially, as described above, an itinerant. Itinerants are introduced into the world at a small number of geographical locations, and gradually, not all at once. Itinerants travel as described above. As they move they will leave breadcrumb trails with a roughly ten metre resolution. If they cross an existing track which goes in roughly the right direction they will prefer to follow it. Once a track has been followed by a certain number of proto-actors, it becomes a road.
- An itinerant who finds an area of unsettled grassland of ten hectares with low soil fertility and not more than one hundred trees settles and becomes a pastoralist. He builds a cottage.
- An itinerant who finds an area of unsettled grassland of ten hectares with medium or high soil fertility becomes an agrarian. He builds a homestead. Depending on the fertility of his land he can support between zero and ten labourers, 10% of a smith, 10% of a miller and 10% of a bonnet laird.
- An itinerant who finds an area of unsettled land of 100 square metres within five hundred metres of a homestead with unfulfilled labourer demand becomes a labourer. He builds a cottage.
- An itinerant who finds an area of unsettled land of 100 square metres within five kilometres of ten farmers with unfilled smithing slots becomes a smith. He builds a cottage and a forge.
- An itinerant who finds an area of unsettled land either at the side of a water course or at the top of a hill, and within 5 kilometers of ten farmers with unfilled milling slots becomes a miller. He builds a mill water or wind, appropriate to location.
- Any settler who plays host to more than a certain number of travellers becomes an innkeeper. He claims 400 square metres of unclaimed land as close as possible to his existing settlement and buids an inn and stableyard.
- An itinerant who finds 400 square metres of unclaimed land within a certain distance of an inn and a smith will become a merchant, provided that there are three smiths within a 5Km radius who have unfilled market slots. The merchant builds a marketplace and a counting house.
- An itinerant who finds 200 square metres of unclaimed land within a specified distance of a market with an unfilled chapel slot becomes a priest and builds a chapel and manse, and possibly a school.
- An itinerant who finds 100 square metres of unclaimed land adjacent to where a road crosses a river becomes a ferryman.
- A ferryman who carries more than a certain number of passengers in a given period becomes a tollkeeper and builds a bridge.
- the number of people for whom it is the nearest market
- the sum of the wealth (soil fertility) of the homesteads for which it is the nearest market
- the wealth of other markets within a day's travel
Rendering the buildings
This seems to me to provide an algorithmic means of settling a landscape which will generate organic and satisfying patterns of settlement. But it all fails if the buildings are chosen from a limited palette of models. As with the trees I think we need algorithmic mechanisms of building similar-but-different buildings which can be repeatably rendered from relatively small data sets. As an example of what I mean, in damper landscapes where wood is likely to be available, there might be a higher probability of stave buildings, or weatherboarding, with mainly shingle roofs. In slightly less damp areas where timber is still available, cruck frames and half timbered buildings will prevail, with mostly thatched roofs. In the dryest areas, cob and brick buildings will be common, often with tile roofs. On steeper hillsides, stone buildings will be common, perhaps with slate roofs. Within each of these types there are essential cells from which a building is created. These cells can be longer or shorter, taller or lower, wider or narrower. A building may comprise a single cell, or more. If more than three cells they may be arranged in a row or round a courtyard. And they may have one story or two. Which they have can be based like the details of the plants on functions which take latitude and longitude as arguments and which, internally use pseudo-randoms seeded from those latitude and longitude values.How vast a world?
OK, so, with this general approach, how big can we go? The answer seems to me to be 'big enough'. A 32 bit integer gives somewhat over four billion values, so can resolve down to one millimetre precision in a world 4000 kilometres by 4000 kilometres. But we don't actually need millimetre resolution; centimetre would be quite small enough. And that gives us potential for a world 40000Km square, or 1.6 billion square kilometres, which is three times the surface area of planet Earth.In practice we can't go that big for all sorts of space and time reasons. Recording land heights is inevitably an issue. I don't know of a pseudo random function which will generate satisfying land heights. Anything based on Julia sets, for example, ends up with landforms symmetrical around a central point. Furthermore, the shapes of fractals which can be constructed from simple functions tend to have a noticable and unnatural degree of self-similarity across scales. I'd dearly like to be wrong on this, but I think we need to store at minimum elevation values at ten metre intervals. If we can accept 100mm resolution for elevations, storing 16 bit values gives a range of 6,500 metres - 21,000 feet - from the deepest seabed to the peaks of the highest mountains.
This means that landform information alone requires 20Kbytes per square kilometre - unindexed, but seeing it's a rigid ten metre grid that isn't a problem. Which, in turn, means that you can store landform information for a planet the size of Earth in one terrabyte. But we don't need a planet the size of earth. Scotland is 80,000 square kilometers of land area; allowing for a bit of sea around to the horizon around it, say 100,000 square kilometers. That seems to me more than big enough to be a game space. It amounts to 160Mb of landform data, which is completely manageable.
If we stored plant data for every distinctive plant in Scotland - even at one per ten square metres - that does become an impractically large data set, because quite apart from anything else, the plant locations do have to be indexed. But actually, given that the actual plants that grow are a function of the location at which they grow, no player is going to notice if the pattern of the locations of plants is the same for each square kilometre. So we can manage plant data for a land area the size of Scotland in 400,000 bytes - we could do it in less (if the locations were generated using a pseudo-random function, much less).
Building data is different. We need to store the latitude, longitude and type of every building explicitly, and again they need to be indexed in order that we can recover the buildings in a given area efficiently. We need about 16 bytes per building (four bytes latitude, four longitude, two type; then for each tile a null-terminated vector of pointers to building records). If we assume that our feudal land of 80,000 square kilometers has a population of a million, and that there are on average five occupants of every building, that's two hundred thousand buildings, implying 3.2Mb of data.
Of course, that's just the backing store size. As tiles are loaded into active memory - see the essay 'Tiles and Flats' this raw backing data has to be inflated procedurally into actual models that can be rendered; models which may have thousands of vertices and hundreds of kilobytes of textures. The functions which do that inflating have some finite size, and, significantly, they'll need to work on prototype models which will in turn take storage space. Finally there are hand-edited models potentially used at particular plot locations; those need to be stored more or less in full. But all this has not become an unmanageable amount of data. It seems to me plausible that you could store a fully populated 100,000 square kilometer game world on one uncompressed 700Mb CD. On a 4Gb DVD, you could do it very easily.