Wednesday 18 December 2013

Modelling the ships


Yes, really I have been working on the novel today, but the actual text is not much further forward. I'm now up to 20,500 words, which is progress but not fast. However, I've been doing a lot of background research on speeds of camel caravans, speeds of iron age sailing ships, and other details; and I've been working on my calendar, to make sure all the right characters can plausibly be where I need them to be at the right dates, given the transport they have available to them.

Interestingly, this exercise has brought back to mind (and validated) my previous essay, 'The spread of knowledge in a large game world'. Because of the speed of the ships (more below) my protagonist sails from the northern city where he's spent the early summer thirty five days after his home city has been conquered - but he does not know of it, because there's no plausible way for news to reach him. On his way he visits The City at Her Gates, and there he may plausibly pick up news of the conquest of his home - but it's equally possible that he won't, since although the news could have reached there if a convenient ship has happened to make the passage, it's possible that it hasn't.

The Ship

But most importantly, I've been modelling the ship.

The plot driver for Merchant is the arrival of a disruptive technology: new ships which are seaworthy enough to circumnavigate the continent, and thus bypass the bottlenecks of the caravan road which has been the main trade route for hundreds of years. So, I need a ship. But since this world is not based on real history, it has to be an ahistoric ship: a plausible, effective type of ship which would work and could have been developed in the real world, but wasn't.

Rig

So what I've gone for is a semi-elliptical squaresail. That is to say, a fully battened sail which hoists and reefs like a Chinese junk sail, but which tacks like a square sail: it has no specific luff or leach. This actually has some advantages. A junk mast cannot effectively be stayed, because when tacking the sail sweeps through the area where useful stays would have to be. Consequently, junks have unstayed masts. With a semi-elliptical squaresail, you would be able to have stays on the mast, provided they allowed the yards to be braced round to 45 degrees. Also, with a semi-elliptical squaresail, you can set the aerodynamic curve of the sail to a considerable extent by using curved battens (semi-elliptical battens, hence the name). So this sail would be very much more effective upwind than a conventional square sail. Also, as the sail is reefed by lowering it, it's not necessary to send men aloft to reef it.

Such a rig could only develop in a place with tall, straight trees - forest-grown conifers, probably, because although unlike junk rigs the mast can be stayed, it can only be stayed at the very top - you can't hoist the sails past the attachment point.

Of course, Shearwater would not be as effective upwind as a modern racing yacht. She could not lie closer than 45 degrees to the wind and in practice wouldn't lie closer than 50 degrees; she'd also make a certain amount of leeway, so 60 degrees off the wind is a likely most effective course. But downwind she'd be pretty good, and on a reach she would be very effective. As I'm positing a prevailing westerly coriolis airflow modified by strong convection over the steppe giving rise to reliable onshore winds all along the southern coast, a 'trade route' going east along the north coast, south along the east coast, west along the south coast and north along the west coast would involve very little if any beating to windward - for a ship which could lie a close reach, as this one could.


She has no staysails - she could have, but I'm assuming that either they haven't been invented or haven't been found advantageous - so she has no bowsprit and her bow is short and cobby. Her forecastle extends right to the stem, and is pretty broad at the front, with no front bulwark or rail. This is to make anchor handling easier (but, of course, more dangerous). She's a cargo ship, a beast of burden, and she's not slim or elegant; but on a 40 metre overall length she has a payload of at least 40 tons of cargo - possibly significantly more, since her displacement will be more than twice that.

Cargo

Not only does she carry at least as much as 160 camels, she can make two round trips from north to south in the time a camel caravan can make one. This surprised me. But camel caravan speed is about 32 km/day, whereas the speed of a classical period merchant ship was about 220 km/day, which is much greater. So even if (as I'm assuming) the sea route round the continent is three times as long as the caravan road, the ship, with seven times the speed, makes the journey in less than half the time. That in turn means - something I hadn't seen until I worked this out - that not only could she outperform the camel caravans on the cargoes they can carry, she could carry perishable cargoes - perhaps fruit, for example - that they couldn't carry. No wonder the cities which depend on the caravan trade feel threatened!
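The comparison above can be checked with a few lines of arithmetic. This is only a sketch: the speeds, the 3:1 route ratio, the 40 ton payload and the 160 camels come from the text, but the road length (`ROAD_KM`) is a made-up placeholder, since no actual distance is given, and a ton is assumed to be 1000 kg.

```python
# Figures from the text.
CARAVAN_KM_PER_DAY = 32
SHIP_KM_PER_DAY = 220
SEA_TO_ROAD_RATIO = 3      # sea route assumed three times as long as the road
CARGO_TONS = 40
CAMELS = 160

# Hypothetical: the text gives no distance. The ratios below do not depend on it.
ROAD_KM = 1600


def days_for(distance_km, km_per_day):
    """Days needed to cover a distance at a steady daily rate."""
    return distance_km / km_per_day


caravan_days = days_for(ROAD_KM, CARAVAN_KM_PER_DAY)
ship_days = days_for(ROAD_KM * SEA_TO_ROAD_RATIO, SHIP_KM_PER_DAY)

speed_ratio = SHIP_KM_PER_DAY / CARAVAN_KM_PER_DAY   # roughly seven
time_ratio = ship_days / caravan_days                # under one half

# Load per camel implied by "40 tons is as much as 160 camels",
# assuming 1000 kg to the ton.
kg_per_camel = CARGO_TONS * 1000 / CAMELS

print(f"ship is {speed_ratio:.2f} times faster")
print(f"ship takes {time_ratio:.0%} of the caravan's time")
print(f"implied load per camel: {kg_per_camel:.0f} kg")
```

Because the road length cancels out of the time ratio, the 'less than half the time' conclusion holds for any distance, so long as the 3:1 route ratio and the two speeds are about right.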

As a cargo ship she has a cargo derrick to aid loading and unloading positioned over the main hatchway.


She has some features which are there simply because I needed them in the story; for example the shore boat I mentioned in chapter one is visible stowed on the main hatch; the stair on which Dalwhiel stands to talk to Karakhan is visible, the companionway out from the after castle is there, and even has an arched roof to add headroom.

Building the model of the ship helps me to visualise it and consequently to be able to write more confidently about it; it's the same reason I have town plans of my cities. I can even visualise her in the context of them.




Monday 16 December 2013

In praise of LuminusWeb

Well, I've just finished my first in-anger, for-a-paying-customer, website in Clojure. Essentially it's just a very simple CRUD system; it landed on my desk last week in a rush because the agency which had been supposed to be building it had drawn some pretty pictures and then thrown their hands up in the air and said 'this is too hard'.

It wasn't a requirement it be written in Clojure; in fact, until I tacked a credits line on the bottom of the pages saying 'Powered by Clojure' I don't think the customer knew that it was. I estimated four days on a fixed price basis; I think this was fair. In fact it took six, but I worked over the weekend so the project hasn't been delayed by my overrun.

Some of the overrun was unforeseen - the agency who abandoned the job had built the forms in JotForm, and they proved to be so horrible that I had to rewrite them from scratch - the HTML was bizarrely bad, and none of the field names were meaningful. Also, the pretty pictures drawn by the agency were all fixed size - they didn't flow or scale. I admit I'm a snob, but if a website is going to be identified as my work I want it to be right. So now, it is right (well, mostly; the navigation does something ugly and not very usable if you shrink a desktop browser window too small, but I haven't yet found a workaround for that which fits with the design). Now, it uses media queries to distinguish smartphones, tablets and desktops, and serves the appropriate CSS for each. Yes, that cost me time, too, but in my opinion it's worth it.

And, of course, there was a bit of feature creep - there always is. The customer asked for several features which hadn't been in the original specification, and I've delivered them - but they all add time.

But the overrun I did foresee - learning time - was much less than I would have expected, and that was down to LuminusWeb. LuminusWeb is not really so much an integrated toolkit in its own right as a collection of tools from a range of developers. It doesn't use all the tools which, if I was selecting for myself, I would choose; for example it uses Selmer for generating dynamic content rather than the, in my opinion, much more elegant Enlive. But what LuminusWeb does provide - which makes all the difference if you're building something against a deadline - includes:

  • a great project template which sets up most of your scaffolding
  • a very clear tutorial, and
  • excellent documentation.

LuminusWeb's scaffolding and examples, with a little help from a couple of books, allowed me to build what I needed very quickly, and with remarkably little pain.

LuminusWeb is also very up-to-date; it uses current or near-current versions of all the tools it depends on.

There is one thing I'm disappointed with. I had hoped to integrate Chas Emerick's Friend authentication package to do OpenID authentication, but I struggled with the documentation (despite good example programs from both Chas himself and from Assen Kolov), and simply couldn't get this going in time. I did work out how I would have done proper database-layer authentication, but in the end I simply did application layer authentication because it was quick and easy.

Altogether, it's been a very pleasant experience and I'm kicking myself for not having done it sooner.

Sunday 8 December 2013

On Rascarrel Shore

My cross bike on Rascarrel Shore, 2010
When I was a wee boy, we used to go down to Rascarrel shore fairly regularly, and I used to play pooh-sticks in the burn from the wee footbridge. It wasn't, as a boy, my favourite beach, being stony and windswept, but it was under a mile and a half from our house up at Nether Hazelfield, and the old medieval road from Rascarrel shore up to Rerrick, although by then abandoned, ran by the end of our lane and was still recognised by the farmers as a public right of way. By the time I was ten, I was permitted to walk down to Rascarrel on my own.

Rascarrel was not a pristine shore. It was part of a populated landscape. It had been occupied by the barytes miners, by miners for coal and copper, by fisherfolk and by smugglers over the historical period.  John Thomson's map of 1832 shows a coal mine and a smithy on the shore. There are dwelling sites all along the bay. The beach itself is rocky, and strewn with large, rounded pebbles; there's nowhere that's sandy, nowhere to launch a boat, and few places where it was easy for a child to get into the water to swim. Consequently, as a child, I preferred Balcary for sailing and Red Haven for swimming. But Rascarrel was our nearest beach, and the only one I could get to by myself, so I went there often. There is a natural rock arch which I've loved all my life and have many photographs of, and several minor caves.

There had been a little barytes mine at the end of Rascarrel shore; it must have been a small works, no more than a couple of men and a boy in all likelihood, because the ruin of the miner's cottage is a wee place. The mine was last worked in my lifetime, but I don't recall ever seeing that cottage with a roof on. What I do recall, however, was three separate groups of cheerful, well maintained summer huts, owned by people mainly from Dumfries. There was one group of two or three huts by the miner's cottage; a group of six or seven just to the east of the burn, where the old medieval road comes to the shore; and another group of three or four about a hundred metres west of the burn, accessed over that footbridge. They had been there since the 1930s, and were a well established part of the landscape.

In the early seventies, the footbridge washed away in a spate and was replaced by a new, higher bridge, I'm pretty sure at the council's expense. As is appropriate; it was a public good on a public right of way. In summer, Eddy Parker used to keep his boat moored in the burn mouth; an exposed place, but it was a tough old boat and Eddy knew what he was up to. I would sometimes anchor my boat off there in the summer. We would go down in autumn and winter to collect mussels off the rocks, or pick up driftwood along the strand.

As Margaret Thatcher's recession closed in in the 1980s, the poorer folk of the village went down onto the shore at Rascarrel to collect mussels which they sold to the shellfish factory in Kirkcudbright - hard work for little money - and the larger mussels soon vanished in consequence. About this time, too, the council constructed an improved car-park at the shore end of the lane, at public expense. During the 1990s I used regularly to ride a mountain bike on the loop round Rascarrel shore, and back by Balcary Heughs. The place has always been part of my life.

My niece Zoe in Bill's hut on Rascarrel shore, 1983
All this time the hutters came and went, and maintained their little cabins well. I didn't know any of them, but I would greet them in passing, as you do. There was no hostility between village folk and the hutters. In the early eighties one of the huts was lived in by a friend of my sister's, Bill, and we used to walk my niece Zoe who was then about one down to the beach in her wheelchair, and have a mug of tea in Bill's hut while he yarned or played guitar.

Rascarrel farm was then owned by a family called Hendry, who had a daughter a bit younger than me. She married one of the MacTaggart lads, and together they inherited the farm. The hutters continued to quietly enjoy their huts, but there were complaints about increases in rent demanded by MacTaggart, and some of the huts fell into disrepair.

In 2004, the MacTaggarts decided to evict the hutters and build their own holiday chalet development on the site. There was a long legal fight, but in 2007 it was lost and the hutters were evicted. The MacTaggarts then applied for planning permission, which was fiercely resisted by Auchencairn Community Council. However, after two applications were turned down by Dumfries and Galloway council, a third application was appealed to the Scottish Government, who overruled local objections and approved the scheme.

In the past year, the MacTaggarts have pressed ahead with the plan, destroying ancient woodland and bulldozing new house sites out of the cliff. They've erected lights, security cameras, and alarms, and angrily harass members of the public using the beach and the track to it.
During the course of the year, 'old' Mr MacTaggart has died; but his son continues the same aggressive policy.

On the Roy Military Survey of Scotland, 1747-1755, the lands of Rascarrel are not enclosed. It was then common land. It's not marked as enclosed land on subsequent surveys either, but then they typically don't mark enclosures whereas Roy's map did, so they aren't evidence. Still, I should be interested to know when, and under what legal theory, the right of common was extinguished.

On the shore road out of Auchencairn towards Balcary, reed-beds are naturally reclaiming the bay. There's an area that was open mud when I was a boy, and then reed-bed, and then, as land was established, rough grass; and now the householder across the road has put fences round it and claimed it as his own - again, on what legal theory I don't know. There is an assumption these days that land must belong to someone, that anything as dangerous and subversive as unowned land must be extinguished as quickly as possible. When Rascarrel was 'just a farm', rented from Auchencairn estate, did the rented area extend to the tideline? When the estate was parcelled up and sold off, were the lands to the tideline sold with it? I'm guessing so, or someone would have protested before now. But, more significantly, what claim apart from arrogance and fiat did Auchencairn estate have to the shore?

Whatever the legal theory, what has happened on Rascarrel shore in the past decade amounts to enclosure of common, since this was land freely enjoyed by the people of Auchencairn and the surrounding area for leisure, for recreation, for fishing for generations. It amounts also to the extinguishing of our rights under the Land Reform Act. It also undermines the rights of all hutters, everywhere in Scotland. If we aren't prepared to defend those rights, we will lose them.

A first look at Tchahua

For anyone interested in the story I'm working on, this is a first look at my model of the city of Tchahua. The deep water quay is front right; the bridge (front left) is roughly modelled on the medieval London Bridge, but without the buildings that were built on it.

Obviously the castle (centre right) is the Residence; I've shown the outer ward walls lower than the inner ward, partly because for defence you would want to be able to shoot down on them if the outer ward was captured, but partly because it's a different phase of building - the outer ward would have been added when the deep water quay was developed. Facing the Residence (middle of the picture) are the guildhall and a couple of large buildings which I'm currently vaguely thinking of as probably inns. I suspect there should be a religious building somewhere but I haven't got that worked out yet - there may be several temples because I don't think the Dragon cult is a state religion so there are probably other cults.

Upstream of the bridge - left of picture - is the old harbour, which older, smaller ships could access through the lifting span of the bridge; the two large buildings on the quayside there are the silk warehouse and the spirit warehouse. I think the spirit warehouse is the one nearer the Residence, but again that isn't certain.

The area between the guildhall and the Residence is the market place and will not be built on, but the area between the guildhall and the city wall should be densely packed with dwellings and workplaces. Upstream of the warehouses should be the richer merchants' houses, probably quite grand.

I'm viewing pitched roofs as probable because it's coastal with regular onshore winds drawn in by convection over the steppe inland; it's likely to have moderate rainfall. I haven't yet made up my mind about building technology, but I think there is a mix of mainly-timber and mainly-stone buildings. There probably isn't any brick because the local stone is limestone, so there are unlikely to be clay deposits.

To the left of the city - under the bridge - is the eponymous river, which is the same river which runs underground under the city of Hans'hua. I'm not yet fully committed to what the arm of water to the right of the city is. It may be a shallow, marshy bay, or it may be another river. The city wall spans the peninsula from one shore to the other, but does not actually enclose the city. In the story there are sections of it which are in poor repair.

Looking at this model it's obvious that there should be something significant at the end of the peninsula, bottom right of this view. I'm not yet sure what that would be. As I originally imagined it, the Residence pretty much was the end of the peninsula, but when I modelled it like that it didn't look like real geography.

There must be a kink in the river downstream of the harbour, as at Kirkcudbright, for example, to shelter ships in harbour from storms. I haven't got the wider geography pinned down yet. Obviously this is the same continent as in my 'Modelling river systems' note, and the precise location of Tchahua is the estuary towards the western end of the south coast. But that's rough geography, and the benefit of it being my world is I can chop it around as suits me.

There's a lot that isn't in this picture, yet. Obviously there need to be derricks on the quays, boats here and there, lots more houses, public wells - the stuff that makes a city viable. Also, this is a city which has grown considerably richer in the past decade and a half, so there should be a lot of building going on (which may be why the wall is in poor repair - people may have been robbing building stone out of the city wall).

Saturday 7 December 2013

More grief creating formatted documents

I write my fiction using a 'word processor' which is in fact no more than a hacked together set of shell scripts. To produce final proof output, I need a tool to render my text into nicely formatted PDF or Postscript. I do this by way of HTML, but I still need a tool for the HTML to PDF step. For years I've used Prince, which is very good indeed. It has three problems from my point of view:
  • It's proprietary software, and although you can legitimately use it for free, if you do it prints its own logo on the cover page of your document;
  • It's too expensive (US$ 495) for me to be able to really justify a license;
  • And finally - for me this one's the killer - it doesn't run on Debian, and because it isn't free, you can't just compile it yourself.
It does run on Ubuntu, and consequently I do run Ubuntu on one of my machines just so that I can run Prince, but I now want to run my fiction through my continuous integration toolchain, which runs under Jenkins on my server; and my server runs Debian.

Consequently over the years I have periodically evaluated other options - genuinely free software options - for doing my final formatting step. So far, I've found nothing good enough. Today, I've tried again.

There are two new options, pandoc and wkhtmltopdf.

pandoc

Pandoc is an ambitious project to create a Swiss army knife for converting between text document formats - to do for text documents what ImageMagick does for raster graphics. To generate PDF it depends on LaTeX, and can use a variety of different LaTeX libraries to achieve this effect. Unfortunately, the LaTeX stage crashes and I'm not sufficiently up on debugging LaTeX to work out why, although the error message suggests it's to do with not finding the right fonts:

simon@engraver:~/Documents/fiction/slave$ pandoc -o merchant.pdf merchant.html 
pandoc: Error producing PDF from TeX source.
! Font T1/cmr/m/n/10=ecrm1000 at 10.0pt not loadable: Metric (TFM) file not found.
                   relax 
l.100 \fontencoding\encodingdefault\selectfont

So from my point of view, that's a fail.

wkhtmltopdf

Wkhtmltopdf uses the Webkit rendering engine - the same one used in the Konqueror, Chrome, Safari and now Opera browsers - to render the page. You'd think that would be bound to be a good one. Unfortunately, it isn't. It doesn't honour - presumably because browsers don't need to - all the print-oriented vocabulary of CSS.

My stylesheets specify different page margins for left and right hand pages, to put the wider margin in the gutter; they specify that left hand pages should show the book title in small caps on the left of the header line, and the page number on the left of the footer line; while right hand pages should show the current chapter title in small caps on the right of the header line and the page number on the right of the footer line. They specify that an image on the cover should bleed to the edge of the page. They specify that references in the table of contents should be resolved to the page number on which the content appears. They specify even that pages in the frontmatter should be numbered in roman numerals, while pages in the body should be numbered in arabic numerals.

Wkhtmltopdf honours none of this. It doesn't show page headers at all, or page numbers. It can't resolve table of contents references. It won't bleed the cover page differently from the rest of the content. To be fair, it has to be said that the Amazon Kindle, which really ought to, does not honour this vocabulary either; but seventeen years after the publication of the CSS1 specification and ten years after Prince XML was launched, it's a bit disappointing.

Worse, wkhtmltopdf occasionally splits a single line of text over two pages, so that the top half of the letters appear on the bottom of one page while the bottom half are at the top of the following page.

So that too is a fail; not quite such an epic fail as pandoc, but not good.

So it looks as though I'm stuck with Prince; one of these days I may even have to buy it.

Postscript

It transpires that although Prince is not supplied packaged for Debian, there is a 'generic linux' version (a gzipped tar containing, inter alia, an install script which installs neatly and correctly into /usr/local/) which works perfectly on Debian. So I'm happy again.

Thursday 5 December 2013

Restating the case against Land Value Tax

Andy Wightman, author of the Scottish Green Party's report on A Land Value Tax for Scotland, has challenged me to 'crunch some numbers' to demonstrate that land value tax does (as I contend) effectively subsidise the over-exploitation of marginal land, and also effectively subsidise large estates. The argument is not essentially numeric, it's essentially logical, but nevertheless I'll attempt to do so.

Note that I'm not arguing that Land Value Tax is inapplicable in urban areas - on the contrary, in urban areas it may well be a very good tax. I'm arguing that it has consequences in rural areas which act directly and diametrically against the cause of land reform.

First I'd like to introduce the two characters of a three act drama.

Meet the Neighbours

Meet Alice. Alice lives in a wee house on a small holding outside Obaig, where she owns ten hectares. Her neighbour, Boris, owns a thousand hectares, entirely surrounding Alice's holding. Boris's holding includes a hundred hectares of well drained valley land alongside a public road; the remainder of his land is, like Alice's, high, wild and remote from public infrastructure.

Alice, as I've said, lives on her holding, and owns nothing else. She needs to make her living either from it or from the local economy, where wages average £26,000 per annum, or about £21,000 after tax and national insurance. Boris lives mainly in his Monaco home, for tax reasons, but he works in an international private bank, and he also has homes in London, New York and Geneva. His salary, before bonuses, is £1,000,000, and as his residence is Monaco he pays little tax on this.

Following Scotland's independence in 2016, Land Value Tax is introduced. A temporary assessment of £50 per hectare is levied on the lands around Obaig, because assessing the notional unimproved rental value of land is complicated and expensive. So Boris is assessed at £50,000, while Alice is assessed at £500.

Alice keeps eight cows on her holding - it's poor land, and that's as much stock as it can carry. She produces, in an average year, four heifers, three of which she can sell on at about £500 each (she needs to keep one for herd replacement); and four steers, which she sells on at about £350 each. The gross profit on her holding is thus less than £3,000, and the tax is a sixth of that. However, Alice can't afford to go to law to challenge the assessment, so she pays it.

Boris's factor runs 100 breeding cows on his valley land, and so its profit - £37,500 - is somewhat more per hectare than Alice's. On his 900 hectares of hill land, he runs 600 hinds, producing about 400 surplus deer per year; he shoots some of these himself for his own pleasure, but lets most of the shooting for a gross income of £120,000 (£100 each for 200 hinds plus £500 each for 200 stags). That adds up to £157,500. So he could more easily afford his assessment of £50,000 than Alice can hers of £500, but he considers it worth his while to pay £5,000 to a firm of Edinburgh lawyers who challenge the temporary assessment, and so the government reassesses his land at £30 per hectare.
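The sums for the two holdings can be laid out as a short calculation. This is just a sketch using the figures quoted above; the function name `land_tax` is mine, and the £37,500 cattle profit for Boris is taken as given rather than derived.

```python
def land_tax(hectares, rate_per_hectare):
    """Flat per-hectare assessment, as in the temporary scheme described."""
    return hectares * rate_per_hectare


# Alice: 10 hectares, sells three heifers and four steers a year.
alice_gross = 3 * 500 + 4 * 350
alice_tax = land_tax(10, 50)
alice_share = alice_tax / alice_gross      # fraction of gross taken by the tax

# Boris: 1000 hectares, cattle on the valley land plus let shooting on the hill.
boris_cattle = 37_500                      # quoted in the text, not derived here
boris_shooting = 200 * 100 + 200 * 500     # 200 hinds and 200 stags
boris_gross = boris_cattle + boris_shooting
boris_tax_initial = land_tax(1000, 50)
boris_tax_reassessed = land_tax(1000, 30)  # after the lawyers' challenge

print(f"Alice: gross {alice_gross}, tax {alice_tax}, share {alice_share:.1%}")
print(f"Boris: gross {boris_gross}, tax {boris_tax_initial} "
      f"falling to {boris_tax_reassessed}")
```

The point of the story is that Boris's £5,000 legal bill buys a £20,000 reduction in his annual assessment, while Alice has no comparable option.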

This continues for five years, before the government manages to get the lands properly assessed. The new assessment is £30 per hectare for the hill land, and £60 per hectare for the valley land. So Boris sells his valley land, moves his cattle up onto the hill, and again challenges the assessment. The tax authorities, eager to avoid an expensive legal fight, lower the assessment on his land (but not on Alice's) to £20 per hectare. His land is now overgrazed, and consequently deer keep breaking through the fence into Alice's property. She carries out urgent repairs to her fences; as they're march fences she asks Boris to contribute half the cost. But Boris's favourite lawyers argue that the old fence was perfectly adequate and that Alice's repairs are excessive. They offer to pay only 10%, and as the poor have no lawyers, Alice has to accept this.

Also, the Range Rovers and Porsche Cayennes used by Boris and his sporting customers have torn great holes in the 5 km track leading to Alice's holding, and it's become too rutted and potholed for Alice's fifteen-year-old Golf to cope. Her deeds say that the cost of repairs to the track should be divided evenly between all the landowners the track serves, which is her and Boris. But she can't afford to pay 50% of the costs. So either she continues to live in her house but can no longer get out to do her £26,000 job in Obaig; or else she rents a flat in Obaig in which case she can no longer get in to feed and tend her cattle in winter. So she sells her land to Boris for much less than she thought it was worth, and six months later is homeless on the streets of Glasgow.

What are the morals of this story?

The cost of assessment

Firstly, Land Value Tax is complex and expensive to assess, since it's a notional, variable proportion of an uncertain rental value. It's bound to be open to challenge, but only those with large holdings can afford the legal fees to hire the lawyers to challenge the assessment. What that means is, people with larger holdings will pay less tax than those with smaller holdings of similar land, because the assessment authorities will (reasonably) price in the risk of legal challenge.

The economies of scale

Inequity of holding size

Second, people with larger holdings have more economic options for their use. For example, Alice cannot effectively run deer for shooting, because deer need a larger range, and 'sporting' shooters want the experience of the 'romantic' bleak open moorland, not relatively small fenced areas. And if Alice doesn't fence and allows her deer to roam across the land boundary, she's likely to lose more to shooting by Boris's customers than gain from Boris's deer straying onto her land. And if it comes to a legal fight, once again the poor have no lawyers, so Alice loses.

Geometry and encirclement

Again, suppose an electricity company came along seeking to erect a wind turbine. If the turbine were on Alice's land, she would get the rental from the turbine itself, but Boris would get the rental from the wayleave for transmission across his land - if he even granted that wayleave. If the turbine is on Boris's land, Alice gets nothing. If Boris suggests to the electricity company that he might get difficult about wayleaves if they put the turbine on Alice's land, they'll almost certainly choose his - it's easier and cheaper for them.

Generally, if a large holding is surrounded by many smaller holdings, then the owner of the large holding, if seeking a wayleave for, for example, a new access track, has many potential partners to negotiate with and can play one off against another, pushing down the cost. But a small landowner surrounded by one or two larger holdings does not have that option.

Geometry and linear features

Again, consider that access track. 4.9 km of it is across Boris's land, allowing Boris to get close to, to feed or shoot, deer anywhere along it. Boris's vehicles, as well as Alice's, use that section, and it gets a lot of wear and tear. Boris can get to his land without crossing Alice's land; he can if he wants build other tracks which don't go near Alice. One hundred metres of the track are across Alice's land; that section gets much less wear, since Boris's relatively high powered vehicles don't use it. But it's normal in Scotland for costs of repairs to a private access track to be evenly divided between the landowners the track serves; it's the arrangement we have here. This suits large landowners far better than small.

Ninety-eight hectares of Boris's land are within one hundred metres of the track, and so easily serviced by it. But supposing the track goes to the middle of Alice's land, only six hectares of her land are. But she pays the same cost of access to those six hectares as Boris pays to service his ninety-eight.

Suppose, also, that both Alice and Boris require to fence their land from their neighbours. The length of fence required scales with the square root of the area, so Alice must pay for 1265 metres of fence at about £1 per metre, or roughly £120 per hectare. Boris needs only ten times as much fence for a hundred times the area, and so pays roughly £12 per hectare (of course, these are march fences and so the cost would normally be shared 50/50 with the adjoining land-owner; so the costs would be half as much, but the scaling factor is the same).
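The scaling can be checked with a few lines of code. This is only a sketch under assumptions which are mine rather than the essay's: both holdings are treated as perfect squares, Alice's is taken to be ten hectares (which is what 1265 metres of fence implies) and Boris's a thousand, and the class and method names are invented for illustration.

```java
/** Hypothetical sketch: fencing cost per hectare for a square holding. */
class FenceCost {
    /** Perimeter, in metres, of a square holding of the given area in hectares. */
    static double perimeterMetres(double hectares) {
        double squareMetres = hectares * 10_000.0; // one hectare is 10,000 square metres
        return 4.0 * Math.sqrt(squareMetres);
    }

    /** Cost of fencing the whole perimeter, divided by the area enclosed. */
    static double costPerHectare(double hectares, double poundsPerMetre) {
        return perimeterMetres(hectares) * poundsPerMetre / hectares;
    }

    public static void main(String[] args) {
        // Assumed sizes: 10 ha for the small holding, 1000 ha for the large one.
        System.out.printf("small: %.0f m, %.2f pounds/ha%n",
                perimeterMetres(10), costPerHectare(10, 1.0));
        System.out.printf("large: %.0f m, %.2f pounds/ha%n",
                perimeterMetres(1000), costPerHectare(1000, 1.0));
    }
}
```

On those assumptions the small holding needs about 1265 metres of fence and the large one ten times that, so the cost per hectare differs by a factor of ten, in line with the rounded figures above.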

Marginal and submarginal land

Many of Scotland's landed estates are 'sporting' estates on very marginal land. The land is marginal, in many cases, precisely because of a long history of mismanagement by estate owners: they're overgrazed so there's no natural regeneration of trees, the topsoil is washing away. Land which was farmed productively by small holders three hundred years ago is now unfit for any agricultural use. And of course it's remote. It's remote because the landowners have cleared the people off the land, so there are no villages for public roads to serve, so there are no public roads to serve the villages which aren't there. It's remote because estate owners have resisted the construction of public roads across their land. Because it is overgrazed into wet desert and unserved by public infrastructure, the value of this land is very low. So the land value tax will be very low.

What this means

Because large estates can afford good lawyers and can challenge what are essentially uncertain assessments, they will tend to pay lower rates of tax per hectare than smaller holdings. So Land Value Tax benefits large estates. Because large estates have inherent economies of scale, and land value tax takes no cognisance of this, land value tax benefits large estates. Because large estates have created large areas of Scotland that are depopulated and ecologically impoverished, they have created conditions in which they will pay very little tax on vast areas of land.

Of course, just because land value tax is a bad land tax doesn't mean that land shouldn't be taxed. Of course it should. But, as I've argued elsewhere, there are better ways.

In summary, Land Value Tax hurts small holders, particularly on fertile land and close to public infrastructure where there is a clear public policy interest in encouraging agriculture, and helps owners of large sporting estates. It should be opposed, not promoted, by land reformers.

Wednesday 4 December 2013

Nae Gods, an' precious few heroes: no place for racism in Scotland

Nationalism in Scotland is in ferment, on the boil, full of interesting ideas and cross-currents. We're enthused and stimulated by preparation and campaigning for the referendum. Ideas from the left and right are encountering one another, and sometimes we're a bit shocked at what we see.

Dennis Canavan is right. We do have to keep our eye on the ball. We do have to work together to achieve the win, because we still do have a hill to climb. So maybe I'm talking out of turn. I've already taken a pop at the old left, and now I'm going to talk about what I see as the misty-eyed romantic right: the idea that there is an authentic culture of Scotland, a true ethnicity: and that that culture, that ethnicity, is Gaelic.

What started me running is this multi-part article on the (generally excellent) Newsnet Scotland website. The article, too, is generally accurate. However, I believe it stretches a point about the extent of Gaelic in early medieval Scotland until it creaks. And in that creaking I hear an echo of special pleading which sounds to me distinctly racist.

I don't claim to know much about the paleolinguistics of Scotland north of the Antonine wall. That there was a general progression from mainly Brythonic/Pictish to mainly Goidelic I believe, although I would be surprised if the hegemony of Goidelic culture over Pictish was as complete as shown. South of the Antonine wall I know more, and in Galloway I know quite a lot. And I'll say this: the map reproduced here, taken from the third part, is just wrong: factually and intellectually dishonest, and spinning a myth about Scottish identity which just isn't true. In 1000 AD and for half a century afterwards, the Brythonic Kingdom of Strathclyde still had its caput at Dumbarton Rock, north of the Clyde, and still extended south into south Ayrshire and Dumfriesshire. To colour that area blue, for Gaelic, is not compatible with the historical record.

As far as Galloway goes, there was limited Goidelic penetration into Wigtownshire - mainly the Rhinns - during the dark age and into the early medieval period (but note that eighth and ninth century place names in the Machars - Wigtown, Whithorn, Glasserton, Sorbie - are Anglian). There are also several kirk-compound names (Kirkcowan, Kirkinner, Kirkmaiden, Kirkmedan) which include the Anglian 'kirk' element but in Brythonic or Goidelic word order. There is, of course, also a large class of names where the Brythonic and Goidelic forms are sufficiently similar not to be diagnostic. So prior to the wars of independence all we can say is that Galloway as a whole and Wigtownshire in particular had an ethnic mix through this period with Anglian and Brythonic names for most elite settlements but some Goidelic settlement names appearing in the West. It appears from the Latin forms used by the scribes of the Lords of Galloway from Fergus through to Alan that their native language was Welsh.

After the Wars of Independence - from about 1340 - everything changes. Many place names disappear altogether; many new, primarily Goidelic, place names emerge. Some names (e.g. Hazelfield -> Auchencairn) are translated directly from an earlier language into Gaelic. This seems to represent a very widespread depopulation, with new Gaelic-speaking immigrants spreading in from the west.

And then by 1600 it's all changed again. Gaelic has vanished; all new names are in what we would now call Scots.

So there was a period of Gaelic ascendency in Galloway and I'll not deny it, but it was short - 250 years at most. This is, historically, one of the most mongrel parts of our mongrel nation, ethnically mixed as far back as you can go. Dammit, we know that the Romans settled veterans from both Syrian and Nubian (black African) legions on farms both north and south of the wall, and it's a very fair bet that the descendants of those people are still here.

Ireland has, historically, been one of the sources of Galloway's population, but it's one of many. The oldest inhabitants we know of were Brythonic, and their descendants are still here. The Anglians came in after the Kingdoms of the North were defeated at Catraeth. The Norse came in sporadically along the coast in the ninth and tenth centuries (Eggerness, Borgue, Southerness, Almorness, Heston, Tinwald, Torthorwald). And the Irish came in. Given the importance of the trade route, we must assume some population movement back and forth between the Rhinns and the Belfast area going back into the stone age, but Gaelic is neither the original nor the most important nor the most enduring language of Galloway.

We don't need ethnic cleansing here. We don't need racism. There is no one true Scottish race. There is no one true Scottish language. Very few of us are of a single, pure, ethnic stock. We're mongrels. Mixed. The product of a complex, turbulent history. That's who we are. And it's who we've always been. King David the First didn't address his charters to 'omnibus hominibus tocius terra sue, Francis et Anglicis, Scottis et Galwensibus' for nothing. When William the Welshman won the Battle of Stirling Bridge, his colleague Andrew Moray was of Flemish extraction; and they were followed as leaders of the Scottish cause by Robert de Brus, a Frenchman with Norwegian ancestry.

Galloway (and Scotland) is not and never was a Gaelic province. It is not and never was Welsh or English or French or Norwegian or Syrian or Nubian. It was and is a glorious mixture of all these things and more, and we should celebrate that. This is no time for misty-eyed, ill-informed romantic myth making, and it's no time for racism. As Dick Gaughan sang, it's time now to sweep the future clear of the lies of a past that we know was never real.

Tuesday 26 November 2013

The old Left, and the new Scotland

Unity and discussion: need for friendly criticism

One of the themes we heard repeatedly at the Radical Independence Conference this weekend was calls for nationalisation: nationalisation of the banks, of Grangemouth, of the oil industry. This makes me very cautious. Of course, conference speeches are not places for nuance, for detail. It's possible that those who urged nationalisation did not mean the statist, centralising nationalisation of 1945. So I'm cautious rather than hostile.

My intention in this essay is to set out the reasons that I'm cautious. This isn't to criticise anyone; it isn't to be hostile to anyone. As Dennis Canavan said, we must keep our eye on the ball; we must achieve independence, and to do that we must work together as a broad front. We don't need schisms, splits. I'm not seeking to promote those. I'm seeking to start a discussion.

Nationalisation provides new targets for elite capture

In this essay I use 'elite capture' as shorthand for the propensity of well-connected influential elites to establish themselves in positions of power and benefit in institutions and schemes set up for the public good, or for the good of specific minorities. For example, what is known as 'the quangocracy' or 'the great and the good' are well connected elite groups who establish themselves in positions of profit in many public bodies within Scotland and the UK generally.

Nationalisation - the concentration of the whole of an industry within a nation into a single unit owned by that nation - provides a target for elite capture. Western experience of industrial organisation is to have decision making power concentrated at the top, in 'the board'. Elites have many excellent and well-developed strategies for the capture of key concentrations of power. Creating large new top-down structures within society, with control over key economic assets, is just inviting elite capture.

Left elites and right elites

It's natural human behaviour - normal, obvious, we all do it - to advance people we know, people we trust, people like us. When this happens in recognised elites - the soi-disant aristocracy, or the 'old school tie' of the public school - we readily recognise it. But the same mechanisms operate on the left; and there are what I would describe as 'right elites' as well as 'left elites' in the British Labour movement.

Labour shadow cabinet composition

There are twenty-seven members of the current Westminster shadow cabinet, and actually, when you look at their records, they're a pretty impressive group of people. Jon Trickett is a plumber to trade, a long time peace campaigner, an anti-fascist, and came up through the trades union movement to become a councillor in his home town of Leeds before being elected to parliament. Steve Bassam, a social worker, founded a squatters union and campaigned for the rights of squatters and the homeless, before serving on his own local council and then in parliament. Ivan Lewis set up a learning difficulties support charity at the age of seventeen. He, too, served on his local council before being elected to parliament.

Many of them are clearly exceptionally bright. Mary Creagh, Andy Burnham and the twins Maria and Angela Eagle were all working class kids who went to Oxbridge. Rachel Reeves may also be - I don't have information on her background, but she was certainly educated at New College, Oxford.

Indeed, nine of the twenty-seven - one third - went to Oxford or Cambridge, and there's the first part of the rub. Four of them - through no fault of their own - had parents who were already members of the British elite. Six of them went to fee paying schools. Thirteen of them - almost half - have never worked outside politics and the Westminster village. Taking the union of those sets (Oxbridge or elite parents or fee paying school or never worked outside politics), nineteen - about two thirds - can be classified as elite. Admittedly, that's a crude score; admittedly, as I've said before, many of these - most of these - are pretty impressive people.

But what they are not is the labour movement. What they are not is 'workers'. True, the nature of work has changed over the past fifty years. Fifty years ago, the Labour front bench contained miners, steelworkers, shipbuilders. We can't expect to see such trades now, as industry has vanished from the landscape. But we have one plumber - one! One social worker, an administrative worker, a school teacher, a television journalist, a radio producer. Those, we can all accept, are real jobs, and more, real work. Workers' jobs; labour, if you like. Six of them. For the rest, one economist; four academics; only five lawyers. All the rest are wonks.

I would argue that the Labour front bench are for the most part a 'right elite'. They have, like their Conservative opposite numbers, succeeded at least partly because they are members of old elite structures - inherited privilege (Hilary Benn, Ed Miliband, Harriet Harman, Yvette Cooper); Oxbridge or private school; direct entry into politics.

Yes, they are polished, impressive people: elite education does that for you. Put them on a panel of potential parliamentary candidates alongside an engineering worker just off back-shift and of course they'll shine. But there's more to it. Those old elite structures have had hundreds of years to develop the - unwritten, unthought, even unconscious - practices of elite capture.

Flowers affair

Joyce McMillan is, of course, perfectly right to argue that the Flowers affair has been blown out of proportion by the right in order to attack the left. However, the Flowers affair has recently highlighted a different sort of elite structure, one which is more clearly a matter of the left. Paul Flowers rose through the ranks of the Labour and Co-operative movements despite the fact that he was frequently discovered to be either useless or a liability. Like Buggins, he was simply shuffled sideways into other posts until he ended up in one in which he could do real damage. Paul Flowers represents a different sort of elite, a left elite, a consequence of the organic development of the left in Britain.

Democratic deficits on the left

What this reveals is the systematic democratic deficit in old British left structures. The left, although it claims to be (and, to be fair, largely aspires to be) democratic, grew up in the Victorian period when telecommunications either didn't exist or else were out of the economic reach of working people. It was natural in the Victorian period to develop a hierarchical system of organisation, with local chapels or branches at the bottom, sending delegates to regional committees which in turn sent delegates to the (national) executive committee. Not every trades union member, of course, makes it to branch meetings - when I was an apprentice printer and a member of the National Graphical Association, we were fined if we failed to attend chapel meetings (and the fines, like our union dues, were deducted from our pay before we got it), but we were not told when or where chapel meetings would take place. The only way to find out was either to be at the preceding meeting, or be told by a friend who had been.

But even ignoring such obvious abuses, the reasons why members may not attend branch meetings are not always just apathy. Union branch meetings tend to run to a formula, and are commonly pretty turgid affairs which take up a lot of an evening. They are often not designed to be inclusive, to be welcoming to the rank and file membership. They tend to select 'in groups'.

But it tends to be branch meetings, not the membership as a whole, who elect delegates to area committees. It tends to be only those delegates who have much contact with the delegates from other areas, so even in those unions where delegates do constitutionally take instruction from their branch on whom to vote for in elections to national committees, the opinion of the branch delegate is likely to be very influential in the branch's choice.

And so it goes. Most trades unions are not participatory democracies. They're not even representative democracies. They're multi-tiered representative democracies, and at each tier the electoral college gets smaller and more self selecting. Of course, in the Victorian period when these structures were established, most trades unions were small, with a few thousand members at most; the process of consolidation and amalgamation over the past hundred and fifty years has further concentrated power, further increased the separation between the people with power - the national executives and general secretaries - and the ordinary membership.

In an electronic age it doesn't have to be like this and increasingly unions do hold direct elections; but the very size of modern unions means that the candidates for office cannot be known to a significant proportion of the membership, so elections - like elections to parliament - have to be on the basis of leaflets of a few thousand words, and such exposure as the candidates can manage to get themselves in trade journals and in the national media.

Egos and personality cults

One of the things which has also badly affected democracy on the left in Scotland has been egos and personality cults. I haven't been directly involved in any of these and I don't really understand the dynamics of them, so I won't attempt to analyse the problem; but I think we can all accept that it has existed, and that it has tended to act in anti-democratic ways.

Decentralisation and democratic control

The EU has a concept - called subsidiarity - that decisions ought to be taken at the most local practical level of democratic control. Smaller, more local, is inherently more democratic. Of course there are risks of elite capture, petty corruption and cronyism in small local structures just as there are in large, national structures, but such issues cause less damage precisely because they are more local. So what I want to argue is that there are smaller, more local, forms of industrial organisation which democratise control far better than crude old-fashioned nationalisation.

Grangemouth and Govan

The petrochemical installation at Grangemouth, and the shipyards on the Clyde, present particular difficulties for the general solution I propose to the problem of concentration of power and of elite capture, so I'll attempt to characterise those problems before going on to talk about more general issues.

As I understand it the petrochemical installation, although currently divided into two separate functional units ('refinery' and 'chemicals plant') is essentially one integrated facility where the parts are largely dependent on the whole and cannot easily be operated or managed separately. Furthermore, it is as I understand it key to the mechanisms which drive the oil along the undersea pipelines which bring it ashore. It's a big deal, a big plant, and important to the nation. Furthermore, it can't reasonably be expected that its workers can, from their own resources, raise the price of new investments when they become necessary. So it depends inherently on outside sources of finance.

This being so one cannot realistically reorganise the plant into a collection of human-scale workers co-ops. Even if you divide it into one co-op for refinery operations, one co-op for chemical operations, one co-op for engineering and maintenance (for example), you still require overall co-ordination. And you require relationships with external investors/lenders, whether those investors/lenders be conventional venture capitalists, a national investment bank, or a collection of mutual banks. Whoever the lenders/investors are, they will need an effective input into top-level decision making. So you inevitably end up with something which looks very much like a top-down board of directors. If we are to maximise national income from the oil we choose to extract from the North Sea, we need Grangemouth. Making Grangemouth work is a bullet we have to bite.

The shipyards are similar, if not necessarily such an extreme case. A shipyard, like any other large industrial site, has a penumbra of sub-contractors, and those sub-contractors can in general easily be workers co-operatives. Ships are, these days, largely built of modules, and the group of workers who build a module is not necessarily very big. And, in any case, it's likely that in future we will put the marine engineering skills of the Clyde more into building offshore energy generating plant than into building large warships, so again the units of labour do not necessarily need to be as large.

But so long as we are building very large engineering systems on the Clyde, there does need to be some co-ordination. There also needs to be lending or investment. So the Clyde shipyard may need more organisational structure than simply loose associations of small and medium sized workers co-ops.

However, these are extreme cases, and we should not build our overall industrial strategy on extreme cases. Most industrial enterprises in Scotland have at most a few hundred workers; organising these as independent workers co-ops is not hard to imagine.

Banking

The United Kingdom has a small number of very large banks - banks which are deemed 'too big to fail'. We have had very few mutual banks, of which the largest - the Co-op Bank - has just failed. Germany by contrast has many Volksbanken - literally banking co-ops - and, additionally, 431 municipal savings banks and eight state-owned Landesbanks. This is in addition to private sector banks.

As several people at the Radical Independence Conference pointed out, the largest banks in Scotland already are publicly owned. They easily could be nationalised. Yes, indeed they could, but they'd still be too big to fail and they would still be targets for elite capture. Rather than centralising that power as national banks, they could be broken up into their individual branches and given to their account-holders as mutuals or to their workers as workers co-ops. Either way, both account holders and workers have clear common interest in ensuring the stability and profitability of the bank they directly own, so have a clear interest in making sure it is well run. And these individual, small banks would not be 'too big to fail'. Banking regulation would still be needed to monitor that not too many of these many small banks were choosing to run the same risks at the same time, but it could be fairly light touch because the consequence of individual banks failing would be manageable.

In particular these small mutual banks must be empowered to invest in Scottish industry, and, in order to make large investments where those are needed, they must be empowered to combine into associations to make particular large loans or investments.

Industry

Workers co-ops are already a well understood concept in Scotland and are supported as a matter of policy by the Scottish Government and more widely by voices on the left. Rather than nationalising industry, I would far rather see the state set up a series of workers co-ops, each of such a size that the members of the co-op can all know one another at least by sight and reputation - so not more than say one thousand members. Obviously, as I've suggested above with Grangemouth and Govan, it may be necessary in some key industries to have some co-ordination between groups of co-ops to allow for efficient running of very large industrial assets, but this should be exceptional not normal. Big may be efficient but it is not always beautiful, and in my opinion there is some trade off between raw efficiency and democratic control. Less wealth spread more evenly may be better than more wealth captured by elites.

Further, I'm not proposing that private industry should be seized and collectivised overnight. I'm suggesting that key industrial assets in which the state has a strategic interest (e.g. Grangemouth, Govan) should be; and that generally, where the state has a controlling interest in an enterprise (for example Prestwick Airport) there should be a presumption that it will be reorganised as a workers co-operative.

Finally I think it would be a good thing if the state provided some systematic incentive for industries to re-organise themselves as workers co-ops; for example, there could be significantly lower levels of corporation tax for co-operatives.

Summary

I do understand why under current circumstances people are calling for nationalisation. Capitalism is out of control and a wholly unreasonable proportion of the common wealth is being captured by a few elite bankers and venture capitalists. But nationalisation not only isn't the only possible solution, it in its turn offers targets which elites - very likely the same elites - will capture.

The alternative, which puts power right in the hands of the people most closely involved, is loose federations of small mutuals and workers co-ops; and I believe it would be as easy to create these as monolithic nationalised industries.

Sunday 10 November 2013

Implementing Milkwood in Java and Clojure

I was recently given, as a coding exercise by a potential employer, this problem.

It's an interesting problem, because the set of N-grams (the problem specification suggests N=3, so trigrams, but I'm sufficiently arrogant that I thought it would be more interesting to generalise it) forms, in effect, a two dimensional problem space. We have to extend the growing tip of the generated text, the meristem, as it were; but to do so we have to search sideways among the options available at each point. Finally, if we fail to find a way forward, we need to back up and try again. The problem seemed to me to indicate a depth-first search. What we're searching for is not an 'optimal' solution; there is no 'best' solution. All possible solutions are equally good, so once one solution is found, that's fine.
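The shape of that search can be sketched in a few lines of Java. This is an illustration of depth-first search with backtracking in general, not the code from either of my implementations: the class name, the method signature and the choice to hold the rules as a map from the last N-1 tokens to a list of candidate next tokens are all assumptions made for the sketch, and the output is assumed to be seeded with at least N-1 tokens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: depth-first composition with backtracking. */
class DepthFirstComposer {
    /**
     * Tries to extend 'output' until it holds 'target' tokens.
     * 'rules' maps each left hand side (the last n-1 tokens emitted)
     * to the candidate tokens which may follow it.
     * Returns true on the first success; on failure 'output' is left as found.
     */
    static boolean extend(Map<List<String>, List<String>> rules,
                          List<String> output, int n, int target) {
        if (output.size() >= target) {
            return true; // any complete solution is as good as any other
        }
        // Copy the glance-back window before the output list is modified.
        List<String> lhs = new ArrayList<>(
                output.subList(output.size() - (n - 1), output.size()));
        for (String candidate : rules.getOrDefault(lhs, List.of())) { // search sideways
            output.add(candidate);                                    // extend the growing tip
            if (extend(rules, output, n, target)) {
                return true;
            }
            output.remove(output.size() - 1);                         // dead end: back up
        }
        return false;
    }
}
```

Candidates are tried here in the order they are stored, which keeps the sketch deterministic; to get varied text one would presumably shuffle them first.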


Data design

So the first issue is (as it often is in algorithmics) data design. Obviously the simpleminded solution would be to have an array of tuples, so the text:

I came, I saw, I conquered.

would be encoded as

I came I
came I saw
I saw I
saw I conquered

The first thing to note is these tuples are rules, with the first N-1 tokens acting as the left hand side of the rule, and the last token acting as the right hand side:

I came => I
came I => saw
I saw => I
saw I => conquered

To be interpreted as 'if the last N-1 tokens I emitted match the left hand side of a rule, the right hand side of that rule is a candidate for what to emit next.'

The next thing to note is that if we're seeking to reconstruct natural language text with at least a persuasive verisimilitude of sense, punctuation marks are tokens in their own right:

I came => COMMA
came COMMA => I
COMMA I => saw
I saw => COMMA
saw COMMA => I
COMMA I => conquered
I conquered => PERIOD
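A tokeniser which treats punctuation marks as tokens in their own right can be sketched with a regular expression. This is a minimal illustration, not the tokeniser from either implementation: the class name, the pattern, and the decision to rename only commas and full stops (leaving any other punctuation mark as itself) are assumptions of the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: words and punctuation marks as separate tokens. */
class PunctuationTokeniser {
    // Either a run of letters, digits and apostrophes, or one punctuation mark.
    private static final Pattern TOKEN =
            Pattern.compile("[\\p{L}\\p{N}']+|\\p{Punct}");

    static List<String> tokenise(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(text);
        while (matcher.find()) {
            String token = matcher.group();
            switch (token) {
                case ",": tokens.add("COMMA"); break;   // named as in the rules above
                case ".": tokens.add("PERIOD"); break;
                default:  tokens.add(token);            // words, and other punctuation as-is
            }
        }
        return tokens;
    }
}
```

Given 'I came, I saw, I conquered.' this yields I, came, COMMA, I, saw, COMMA, I, conquered, PERIOD, which is the token sequence the rules above are built from.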

Now we notice something interesting. It's perfectly possible and legitimate to have two rules with the same left hand side, in this case {COMMA I}. So we could recast the two {COMMA I} rules as a single rule:

COMMA I => [saw | conquered]
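Building such a table, with rules sharing a left hand side merged as they are found, might look like this in Java. It is a sketch under assumptions of my own choosing: the class and method names are invented, and the table is a map from a list of N-1 tokens to the set of tokens seen to follow it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: a table of rules keyed on distinct left hand sides. */
class RuleTable {
    static Map<List<String>, Set<String>> build(List<String> tokens, int n) {
        Map<List<String>, Set<String>> rules = new LinkedHashMap<>();
        // Slide a window of n tokens along the text.
        for (int i = 0; i + n <= tokens.size(); i++) {
            // The first n-1 tokens are the left hand side, the last the right.
            List<String> lhs = new ArrayList<>(tokens.subList(i, i + n - 1));
            String rhs = tokens.get(i + n - 1);
            // A left hand side seen before gains another alternative.
            rules.computeIfAbsent(lhs, key -> new LinkedHashSet<>()).add(rhs);
        }
        return rules;
    }
}
```

For the nine tokens of the example text and N=3 this gives six distinct left hand sides from seven N-grams, with {COMMA I} mapping to both saw and conquered.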

This means that, in our table of rules, each left-hand-side tuple can be distinct, which makes searching easier. However, a system which searches a table of N-ary tuples for matches isn't especially easy or algorithmically efficient to implement. If we had single tokens, we could easily use maps, which can be efficient. One can see at a glance that two tokens occur repeatedly in the first position of the left hand side of the rules, 'I', and 'COMMA'.

'I' has three possible successors:

I [came | saw | conquered] 

However the right hand side is not the same for 'saw' as it is for 'conquered', so this composite rule becomes:

I => [came => [COMMA]| saw => [COMMA]| conquered => [PERIOD]]

This enables us to consider our rules as a recursive map of maps:

[
    I => [came => [COMMA]| saw => [COMMA]| conquered => [PERIOD]] |
    came => [COMMA => [I]] |
    COMMA => [I => [saw | conquered]] |
    saw => [COMMA => [I]]
]

And thus, essentially, as a tree that, given a path, we can walk. Matching becomes trivial and efficient.

Thus far we're almost language independent. I say almost, because in Prolog (which would be a very good implementation language for this problem) we'd simply assert all the N-grams as predicates and let the theorem solver sort them out. However, I've not (yet) tackled this problem in Prolog.
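The recursive map of maps described above can be sketched in Java as a node holding a map from token to child node. This is an illustration only and not the RuleTreeNode class of my Java implementation: the class name, fields and method names here are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: the rules as a tree of maps, walked by path. */
class RuleTree {
    final Map<String, RuleTree> children = new LinkedHashMap<>();

    /** Adds one N-gram as a path from this node, creating nodes as needed. */
    void add(List<String> ngram) {
        RuleTree node = this;
        for (String token : ngram) {
            node = node.children.computeIfAbsent(token, key -> new RuleTree());
        }
    }

    /** Walks 'path' from this node; returns the tokens which may follow it. */
    Set<String> successors(List<String> path) {
        RuleTree node = this;
        for (String token : path) {
            node = node.children.get(token);
            if (node == null) {
                return Set.of(); // no rule matches this path
            }
        }
        return node.children.keySet();
    }
}
```

Matching is then one map lookup per token of the path: adding every trigram of the example text and asking for the successors of COMMA, I gives saw and conquered.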

Implementation: Java

My Java implementation is milkwood-java.

I started in Java, because that's what I was asked to do. Java (or C#, which is to a very close approximation the same language) is pretty much the state of the art as far as imperative, procedural languages go. Yes, I know it's object oriented, and I know Java methods are in principle functions not procedures. But it is still an imperative, procedural language. I say so, so it must be true. What I hope makes this essay interesting is that I then went on to reimplement in Clojure, so I can (and shall) compare and contrast the experience. I'm not (yet) an experienced Clojure hacker; I'm an old Lisp hacker, but I'm rusty even in Lisp, and Clojure isn't really very Lisp-like, so my Clojure version is probably sub-optimal.

But let's talk about Java. I made a tactical error early in my Java implementation which makes it less than optimal, too. We have an input file to analyse, and we don't know how big it is. So my first instinct wasn't to slurp it all into memory and then tokenise it there; my first instinct was to tokenise it from the stream, in passing. That should be much more conservative of store. And so I looked in the Java libraries, and there was a library class called StreamTokenizer. Obviously, that's what I should use, yes? Well, as I learned to my cost, no, actually. The class java.io.StreamTokenizer is really geared to tokenising programming-language source (numbers, quoted strings, comments); it's not a general purpose tokeniser and adapting it to tokenise English wasn't wonderfully successful. That wasted a bit of time, and at the time of writing the Java implementation still depends on StreamTokenizer and consequently doesn't tokenise quite as I would like. If I backported the regex based tokeniser I used in the Clojure version to the Java version (which I easily could) it would be better.

So the first gotcha of Java was that the libraries now contain a lot of accreted crud.

The second point to note about Java is how extraordinarily prolix and bureaucratic it is. My Java implementation runs to almost a thousand lines, of which over 500 lines are actual code (317 comment lines, 107 blank lines, 36 lines of import directives). Now, there are two classes in my solution, Window and WordSequence, which could possibly be refactored into one, saving a little code. But fundamentally it's so large because Java is so prolix.

By contrast, the Clojure reimplementation, which actually does more, is a third the size - 320 lines, of which 47 are blank and 29 are inline comments. I don't yet have a tool which can analyse Clojure documentation comments, but at a guess there's at least fifty lines of those, so the Clojure solution is no more than two fifths of the size of the Java.

The Java implementation comprises eight classes:
  • Composer: essentially the two mutually recursive functions which perform depth first search over the rule set, to compose output
  • Digester: scans a stream of text and composes from it a tree of rules
  • Milkwood: contains the main() method; parses command line arguments
  • RuleTreeNode: a node in the tree of rules
  • Tokeniser: a wrapper around StreamTokenizer, to try to get it to tokenise English; not very successful
  • Window: a fixed length stack of tokens, used as a glance-back window in both scanning and composing
  • WordSequence: a sequence of tokens implemented as a queue
  • Writer: a wrapper around BufferedWriter which performs on-the-fly orthographic tricks to create a verisimilitude of natural English
One might argue that that's excessive decomposition for such a small problem, but actually small classes greatly increase the comprehensibility of the code.
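
To give a flavour of how small such a class can be, here is a sketch of the glance-back window idea. It is not the Window class from milkwood-java: the name TokenWindow, its methods and the choice of ArrayDeque are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch of a fixed length glance-back window of tokens. */
final class TokenWindow {
    private final int capacity;
    private final Deque<String> tokens = new ArrayDeque<>();

    TokenWindow(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be at least 1");
        }
        this.capacity = capacity;
    }

    /** Adds the newest token, discarding the oldest if the window is full. */
    void push(String token) {
        if (tokens.size() == capacity) {
            tokens.removeFirst();
        }
        tokens.addLast(token);
    }

    /** Returns a copy of the window's contents, oldest token first. */
    List<String> contents() {
        return new ArrayList<>(tokens);
    }

    public static void main(String[] args) {
        TokenWindow window = new TokenWindow(2);
        for (String token : new String[] {"under", "milk", "wood"}) {
            window.push(token);
        }
        System.out.println(window.contents()); // prints [milk, wood]
    }
}
```

The same object would serve both while scanning, to recall the last few tokens read, and while composing, to recall the last few tokens emitted.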

There are things I'm not proud of in the Java implementation and I may at some stage go back and polish it more, but it isn't a bad Java implementation and is fairly representative of the use of Java in practice.

Clojure implementation

My Clojure implementation is milkwood-clj.

Some things to say about the Clojure implementation before I start. First, I implemented it in my own time, not under time pressure. Second, although I'm quite new to Clojure, I'm an old Lisp hacker, and even when I'm writing Java there are elements of Lisp-style in what I write. Third, although I'm trying to write as idiomatic Clojure as I'm able, because that's what I'm trying to learn, I am a Lisp hacker at heart and consequently use cond far more than most Clojure people do - despite the horrible bastardised mess Clojure has made of cond. Finally, it was written after the Java implementation so I was able to avoid some of the mistakes I'd made earlier.

I used LightTable as my working environment. I really like the ideas behind LightTable and suspect that in time it will become my IDE of choice, but I haven't got it working for me yet. Particularly I haven't got its 'documentation at cursor' function working, which, given my current (lack of) familiarity with Clojure, is a bit of a nuisance.

I tripped badly over one thing. Clojure, to my great surprise, would not compile my mutually recursive functions as I first wrote them: a function cannot refer to another which has not yet been defined, unless that other is forward-declared with declare (or both are defined locally with letfn), and the algorithm I'd designed depends crucially on mutually recursive functions. However after a bit of flailing around, I remembered it does support dispatch in one function on different arities of arguments, and I was able to rewrite my two functions as different arity branches of the same function, which then compiled without difficulty.

The other trip was that map, in Clojure, is lazy. So when I tried to write my output using

(defn write-output
    "Write this output, doing little orthographic tricks to make it look superficially
     like real English text.

     output: a sequence of tokens to write."
    [output]
    (map write-token output))

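nothing at all was printed, and I couldn't understand why not. The explanation is that map returns a lazy sequence, and nothing was ever consuming it; the solution is that you have to wrap that map in a call to dorun, as in (dorun (map write-token output)), to force it to evaluate.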

Aside from that, writing in Clojure was a total joy. Being able to quickly test ideas in a REPL ('Read Eval Print Loop') is a real benefit. And a clean functional language is so simple to write in, and data structures are so easy to build and walk.

Another thing Clojure makes much easier is unit tests. I got bogged down in the mutual recursion part of the Java problem and unit tests would have helped me - but I didn't write them because the bureaucratic superstructure is just so heavy. Writing unit tests should be a matter of a moment, and in Clojure it is.
I broke the Clojure implementation into four files/namespaces:
  • analyse.clj: read in the input and compile it into a rule tree; more or less Tokeniser and Digester in milkwood-java;
  • core.clj: essentially replaces Milkwood in milkwood-java; parses command line arguments and kicks off the process;
  • synthesise.clj: compose and emit the output; broadly equivalent to Composer and Writer in milkwood-java;
  • utils.clj: small utility functions. Among other things, contains the equivalent of Window in milkwood-java.
Additionally there are two test files, one each for analyse and synthesise, containing in total seven tests with eight assertions. Obviously this is not full test coverage; I wrote tests to test specific functions which I was uncertain about.

Conclusion

Obviously, all Java's bureaucracy does buy you something. It's a very strongly typed language; you can't (or at least it's very hard to) just pass things around without committing to exactly what they will be at compile time. That means that many problems will be caught at compile time. By contrast, many of the functions in my Clojure implementation depend on being passed suitable values and will break at run time if the values passed do not conform.

Also, of course, the JVM is optimised for Java. I've blogged quite a bit about optimising the JVM for functional languages; but, in the meantime, my Java implementation executes about seven times as fast as my Clojure implementation (but I'm timing from the shell and I haven't yet instrumented how long the start up time is for Java vs Clojure). Also, of course, I'm not an experienced Clojure hacker and some of the things I'm doing are very inefficient; Alioth's Clojure/Java figures suggest much less of a performance deficit. But if performance is what critically matters to you, it seems to me that probably the performance of Java is better, and you at least need to do some further investigation.

On the other hand, at bottom Java is fundamentally an Algol, which is to say it's fundamentally a bunch of hacks constructed around things people wanted to tell computers to do. It's a very developed Algol which has learned a great deal from the programming language experience over fifty years, but essentially it's just engineering. There's no profound underlying idea.

Clojure, on the other hand, is to a large extent pure Lambda calculus. It is much, much more elegant. It handles data much more elegantly. It is for me much more enjoyable to write.

Sunday 3 November 2013

A paean in praise of my stove

It's time to sing a paean in praise of my stove.

A stove is the heart of any home, particularly so at this time of year. A stove transmutes wood into heat. But heat comes in a number of forms, and we appreciate it in a number of ways. My stove provides me with toasty warm towels from my heated towel rail, when I step out of the bath. It provides me with the hot water for my bath. It provides me with my hot meals, my well cooked food. It heats my oven and bakes my cakes. And, most important of all, it keeps the whole of my house warm and comfortable. And all this for no fuel bills, save the labour of cutting the wood.

So what is this paragon, I hear you ask; how much, I hear you ask, does such a thing of wonder cost?

Well, for a start, it's not an Aga. Agas are, indeed, wonderful things (although I don't know how well they work on wood) but they're vastly out of my price league; an Aga would cost as much as my house. And, they're enormously heavy. Getting an Aga over the hill to my cabin would have been exceedingly difficult. So no, it's not an Aga. More surprisingly, it's not a Rayburn, either. I've installed second-hand Rayburns in every house I've owned until this one. Rayburns are indeed good, although they are not that good if you burn coal - it's too corrosive, and you end up having to replace the grate and firebricks every year. On wood, which is what I have, Rayburns are fine - a Rayburn would have been good. But at the time I built this house, even a second hand Rayburn was out of my budget.

Also, a Rayburn has a small hotplate - efficient, certainly, but small. A Rayburn oven does not have a window in its door, so you can't see how your cake is rising. A Rayburn's firebox is not adaptable. And, like the Aga, it's very heavy.

No, my stove is a thing called a 'Plamak', or 'Plamark' - it's Bulgarian, and in Bulgaria they use Cyrillic script; it doesn't transliterate perfectly. Specifically, it's a Plamak B: B for boiler.

Back in the days of the old Soviet Union one could buy Moskvitch and Lada cars; Ural motorcycles; Zenit cameras. They were sturdy but crude, by Western standards. Simple, but very cheap, and they worked. My first car was a Moskvitch van. The Plamak is a little bit like that: honestly made, a little crude in places, but it works. Unlike an Aga or a Rayburn it's made of pressed steel - very nicely enamelled, but just pressed steel. The handles on the ovens and firebox are made of something like Bakelite. The rail across the front on which one can hang teatowels to dry isn't very sturdy and it's a little too close to the body of the stove for convenience. The hotplate is just a plate of steel sheet, and will probably, over time, corrode and need to be replaced. There's no insulated cover for the hotplate. The oven doesn't have a built-in thermometer (but it does have a window in the door, so you can easily put a thermometer inside). Unlike an Aga or a Rayburn, it doesn't have a lot of thermal mass, so when the fire goes down it cools quickly - if you're cooking something that needs a consistent temperature you need to pay attention, and feed it small logs frequently.

But, it has real good points.

The fire box has an extra, removable grate. In summer you can put this grate in, and it halves the size of the firebox, allowing you to cook more economically. In winter, obviously, you take it out. The hotplate is enormous - it will easily take half a dozen pans. There's a very simple flue control which switches the smoke path from across under the hotplate and up the chimney, to round under the oven, depending on what sort of cooking you want to do. And cleaning out that flue path under the oven is absurdly easy - you just lift out the oven floor.

It also burns exceedingly well. Frankly it's too big a stove for this little house - until I installed the big radiator and my heated towel rail, I couldn't effectively use the oven because if I ran the stove hot enough to cook in the oven the hot water tank would boil. Now I can control that, by pumping heat out of the hot water circuit through the radiator (at cost, sometimes, of making the house too cosy - it can easily reach thirty degrees in the bedroom), and so I can bake. It does go through wood fairly quickly - two bucketfuls of logs in an evening - but in two hours it will heat enough hot water for two long, deep, hot baths.

All in all I'm enormously pleased with it. So, you ask, what does this paragon of a stove cost? Amazingly, three hundred and eighty pounds. Honestly, if you want a stove that cooks and heats water, get a Plamak B. It's a bargain.

Friday 1 November 2013

Getting Jenkins CI running on Debian 6 under Tomcat

Today's job was to get a continuous integration server set up and integrated with my Redmine project management system. Since I run Debian 6 on my server, and I prefer where possible to install from the official Debian packages, the Redmine version I'm running is 1.1, which is somewhat behind the curve. I had a look around at which continuous integration server to use. I've tentatively picked Jenkins, the more purist-open-source variant of the Hudson/Jenkins project. Reasons include: it's available in the Debian 7 distribution (but sadly not in Debian 6), and it has a plugin for Leiningen, which is my favourite build tool.

So... on to install, and there the fun began.

Installing Jenkins

As I said, Jenkins is not available in the Debian 6 distribution. However, the Jenkins project had set up their own Debian repository, so after adding their key and link to my system I was able to apt-get it. You'd have thought that would be all, but sadly no.

The Jenkins package, as packaged by Jenkins, does not depend on either Tomcat or Jetty. Instead, it assumes you will be serving no other web-apps and tries to install its own servlet engine (I think Jetty, but to be honest I was too annoyed to check before taking it off again). Obviously, I do have other web-apps, so this didn't work for me. However, I copied the WAR file from the Jenkins release into /var/lib/tomcat6/webapps, and, of course, being a web-app, it just worked...

Except it didn't. Jenkins expects to have some space of its own to write to, outside the servlet engine sandbox. That is, in my opinion, bad behaviour. Specifically it expects to be able to create a directory /usr/share/tomcat6/.jenkins, which is bad in two ways: it writes to a directory to which, for security reasons, Tomcat damned well should NOT be able to write, and it creates a hidden directory which a naive administrator might not notice and which consequently might not be backed up.

After some thought I decided to put Jenkins writable space in /var/local, so I executed:

root@goldsmith# mkdir -p /var/local/jenkins
root@goldsmith# chown tomcat6:tomcat6 /var/local/jenkins

(I also symlinked that back to /usr/share/tomcat6/.jenkins, but that seems safe enough to me). I then edited /etc/default/tomcat6 (a useful place to put pre-boot Tomcat stuff) and added

# Jenkins home directory: added by simon 20131101
JENKINS_HOME=/var/local/jenkins

I then restarted Tomcat:

root@goldsmith# /etc/init.d/tomcat6 restart
Stopping Tomcat servlet engine: tomcat6.
Starting Tomcat servlet engine: tomcat6.

... and all was well; by which I mean, Jenkins started.

Configuring Jenkins for even modest security, however, was a complete bitch.

Jenkins has five different authentication models:

  1. It can have authentication switched off entirely. Anyone can do anything... No. Not going to happen, on an Internet facing server.
  2. It can delegate authentication to the servlet engine. I'm not wonderfully happy about that, because administering Tomcat users is a bit of a pain.
  3. It can use LDAP... if you have an LDAP server, which I don't.
  4. It can delegate authentication to the underlying UN*X system, but only if the servlet engine can read /etc/shadow! There's NO WAY I'm permitting that.
  5. It can run its own internal authentication... you'd think that was the obvious one. But as soon as you've selected that option, you're locked out and cannot proceed further.

Fortunately, you can completely reinitialise Jenkins by deleting everything under its home directory and restarting Tomcat; it then proceeds to reinstall a default set of files, and you get a new, empty Jenkins.

But, you can't add people to Jenkins until you've configured 'enable security' and chosen one of the security models. So, first, configure 'Security Realm' to 'Jenkins's own user database', and remember to tick 'Allow users to sign up'.

Then, sign up. That bit's easy, it prompts you.

Then, you need an authorisation strategy. Of these, there are five:

  1. Anyone can do anything (aye, right!)
  2. 'Legacy mode' (only 'admin' can do anything)
  3. Logged-in users can do anything
  4. Matrix-based security
  5. Project-based Matrix Authorization Strategy

If you just tick 'Matrix-based security' or 'Project-based Matrix Authorization Strategy' and click 'Save' without doing anything else, you're locked out again and have to go back to deleting everything in the home directory, restarting Tomcat and starting again.

After ticking either 'Matrix-based security' or 'Project-based Matrix Authorization Strategy' (which are, frankly, the only authorisation strategies which make sense), you MUST tick the box which allows the group 'Anonymous' to 'Administer' BEFORE you do anything else. Otherwise, you're stuffed.

So then you try to add a security group, and, wait, you can't. You're stuffed. The 'internal' security model does not have groups, so you must add yourself - your own user ID - to the security matrix, give yourself permission to administer, and then save, and then revoke 'anonymous' permission to administer, and save. Otherwise any Johnny hacker out there in Netland can come along and pwn your server.

To be fair, there are plugins available to add a number of additional authentication methods, including OpenID. I haven't tried these.

Integrating with Redmine

Now, integrating Redmine with Jenkins. Recall that Jenkins is a fork of the Hudson project; they're still pretty similar, and although there isn't a Redmine plugin specifically for Jenkins, there is one for Hudson. I installed that, and on initial testing it appears to work. I wanted to do the integration from the Redmine end, because Redmine does work for me as a project management tool, and I don't yet know whether I shall stick to Jenkins. But the alternative would have been to install a Redmine plugin into Jenkins - that exists; and, indeed, I may install it, as well, since it seems to have some useful functionality.

However, all this still left one gaping hole. Both my Redmine installation and my Jenkins installation were running over plain old fashioned HTTP, which means I was passing passwords in plain text over HTTP, which is asking for trouble - a continuous integration server, simply in the nature of the beast, can do pretty extensive things and would be a wonderful tool for an attacker to control. So I set up HTTPS using a self-signed certificate - I know, but I don't need a better one - and configured Tomcat to communicate only locally over AJP; I then configured the Apache2 HTTP daemon to redirect appropriate requests received over HTTPS via AJP to Tomcat, using mod_jk.

So far so good.

Still to do

I need to integrate Jenkins with Git; I've downloaded the plugins (and downloading and installing plugins for Jenkins is extremely straightforward) but I've yet to configure them.

Creative Commons Licence
The fool on the hill by Simon Brooke is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License