November 2019

Eyles, Don. Sunburst and Luminary. Boston: Fort Point Press, 2018. ISBN 978-0-9863859-3-3.
In 1966, the author graduated from Boston University with a bachelor's degree in mathematics. He had no immediate job prospects or career plans. He thought he might be interested in computer programming due to a love of solving puzzles, but he had never programmed a computer. When asked, in one of numerous job interviews, how he would go about writing a program to alphabetise a list of names, he admitted he had no idea. One day, walking home from yet another interview, he passed an unimpressive brick building with a sign identifying it as the “MIT Instrumentation Laboratory”. He'd heard a little about the place and, on a lark, walked in and asked if they were hiring. The receptionist handed him a long application form, which he filled out, and he was then immediately sent to interview with a personnel officer. Eyles was amazed when the personnel man seemed bent on persuading him to come to work at the Lab. After reference checking, he was offered a choice of two jobs: one in the “analysis group” (whatever that was), and another on the team developing computer software for landing the Apollo Lunar Module (LM) on the Moon. That sounded interesting, and the job had another benefit attractive to a 21-year-old just graduating from university: it came with deferment from the military draft, which was going into high gear as U.S. involvement in Vietnam deepened.

Near the start of the Apollo project, MIT's Instrumentation Laboratory, led by the legendary “Doc” Charles Stark Draper, won a sole source contract to design and program the guidance system for the Apollo spacecraft, which came to be known as the “Apollo Primary Guidance, Navigation, and Control System” (PGNCS, pronounced “pings”). Draper and his laboratory had pioneered inertial guidance systems for aircraft, guided missiles, and submarines, and had in-depth expertise in all aspects of the challenging problem of enabling the Apollo spacecraft to navigate from the Earth to the Moon, land on the Moon, and return to the Earth without any assistance from ground-based assets. In a normal mission, it was expected that ground-based tracking and computers would assist those on board the spacecraft, but in the interest of reliability and redundancy it was required that completely autonomous navigation would permit accomplishing the mission.

The Instrumentation Laboratory developed an integrated system composed of an inertial measurement unit consisting of gyroscopes and accelerometers that provided a stable reference from which the spacecraft's orientation and velocity could be determined, an optical telescope which allowed aligning the inertial platform by taking sightings on fixed stars, and an Apollo Guidance Computer (AGC), a general purpose digital computer which interfaced to the guidance system, thrusters and engines on the spacecraft, the astronauts' flight controls, and mission control, and was able to perform the complex calculations for en route maneuvers and the unforgiving lunar landing process in real time.

Every Apollo lunar landing mission carried two AGCs: one in the Command Module and another in the Lunar Module. The computer hardware, basic operating system, and navigation support software were identical, but the mission software was customised due to the different hardware and flight profiles of the Command and Lunar Modules. (The commonality of the two computers proved essential in getting the crew of Apollo 13 safely back to Earth after an explosion in the Service Module cut power to the Command Module and disabled its computer. The Lunar Module's AGC was able to perform the critical navigation and guidance operations to put the spacecraft back on course for an Earth landing.)

By the time Don Eyles was hired in 1966, the hardware design of the AGC was largely complete (although a revision, called Block II, was underway which would increase memory capacity and add some instructions which had been found desirable during the initial software development process), and the low-level operating system and support libraries (implementing such functionality as fixed point arithmetic and vector and matrix computations) and a substantial part of the software for the Command Module had been written. But the software for actually landing on the Moon, which would run in the Lunar Module's AGC, was largely just a concept in the minds of its designers. Turning this into hard code would be the job of Don Eyles, who had never written a line of code in his life, and his colleagues. They seemed undaunted by the challenge: after all, nobody knew how to land on the Moon, so whoever attempted the task would have to make it up as they went along, and they had access, in the Instrumentation Laboratory, to the world's most experienced team in the area of inertial guidance.

Today's programmers may be amazed it was possible to get anything at all done on a machine with the capabilities of the Apollo Guidance Computer, much less fly to the Moon and land there. The AGC had a total of 36,864 15-bit words of read-only core rope memory, in which every bit was hand-woven to the specifications of the programmers. As read-only memory, the contents were completely fixed: if a change was required, the memory module in question (which was “potted” in a plastic compound) had to be discarded and a new one woven from scratch. There was no way to make “software patches”. Read-write storage was limited to 2048 15-bit words of magnetic core memory. The read-write memory was non-volatile: its contents were preserved across power loss and restoration. (Each memory word was actually 16 bits in length, but one bit was used for parity checking to detect errors and was not accessible to the programmer.) Memory cycle time was 11.72 microseconds. There was no external bulk storage of any kind (disc, tape, etc.): everything had to be done with the read-only and read-write memory built into the computer.
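For a sense of scale, the word counts quoted above can be converted to modern units. This is plain arithmetic on the figures in the text; the function name is invented for illustration.

```python
WORD_BITS = 15  # programmer-visible bits per word (the parity bit is excluded)

def words_to_kib(words: int, bits_per_word: int = WORD_BITS) -> float:
    """Convert a count of AGC words to kibibytes (1 KiB = 8192 bits)."""
    return words * bits_per_word / 8 / 1024

rope_kib = words_to_kib(36_864)     # read-only core rope
erasable_kib = words_to_kib(2_048)  # read-write magnetic core

print(f"rope: {rope_kib} KiB, erasable: {erasable_kib} KiB")
# prints "rope: 67.5 KiB, erasable: 3.75 KiB"
```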

The AGC software was an example of “real-time programming”, a discipline with which few contemporary programmers are acquainted. As opposed to an “app” which interacts with a user and whose only constraint on how long it takes to respond to requests is the user's patience, a real-time program has to meet inflexible constraints in the real world set by the laws of physics, with failure often resulting in disaster just as surely as hardware malfunctions. For example, when the Lunar Module is descending toward the lunar surface, burning its descent engine to brake toward a smooth touchdown, the LM is perched atop the thrust vector of the engine just like a pencil balanced on the tip of your finger: it is inherently unstable, and only constant corrections will keep it from tumbling over and crashing into the surface, which would be bad. To prevent this, the Lunar Module's AGC runs a piece of software called the digital autopilot (DAP) which, every tenth of a second, issues commands to steer the descent engine's nozzle to keep the Lunar Module pointed flamy side down and adjusts the thrust to maintain the desired descent velocity (the thrust must be constantly adjusted because as propellant is burned, the mass of the LM decreases, and less thrust is needed to maintain the same rate of descent). The AGC/DAP absolutely must compute these steering and throttle commands and send them to the engine every tenth of a second. If it doesn't, the Lunar Module will crash. That's what real-time computing is all about: the computer has to deliver those results in real time, as the clock ticks, and if it doesn't (for example, it decides to give up and flash a Blue Screen of Death instead), then the consequences are not an irritated or enraged user, but actual death in the real world. Similarly, every two seconds the computer must read the spacecraft's position from the inertial measurement unit. 
If it fails to do so, it will hopelessly lose track of which way it's pointed and how fast it is going. Real-time programmers live under these demanding constraints and, especially given the limitations of a computer such as the AGC, must deploy all of their cleverness to meet them without fail, whatever happens, including transient power failures, flaky readings from instruments, user errors, and completely unanticipated “unknown unknowns”.
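The idea of a hard deadline can be made concrete with a toy model. The sketch below is not how the AGC was programmed; it is a simulation, with made-up timing data, of a task released every 100 milliseconds (the autopilot rate mentioned above) which reports the cycles that finished late. Times are integers in milliseconds to keep the arithmetic exact.

```python
def missed_deadlines(compute_ms, period_ms=100):
    """Return the indices of cycles whose work finished after the deadline.

    compute_ms: simulated milliseconds of work per cycle (hypothetical data).
    Cycle i is released at i * period_ms and must finish by (i + 1) * period_ms.
    A cycle that overruns delays the start of the one that follows.
    """
    missed = []
    clock = 0
    for i, work in enumerate(compute_ms):
        release = i * period_ms
        deadline = release + period_ms
        start = max(clock, release)  # wait for release, or for the previous cycle
        clock = start + work
        if clock > deadline:
            missed.append(i)
    return missed

print(missed_deadlines([50, 90, 150, 40, 50]))  # prints [2]
```

In a desktop application a late cycle is a stutter; in the situation described above, it is a steering command that never reached the engine.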

The software which ran in the Lunar Module AGCs for Apollo lunar landing missions was called LUMINARY, and in its final form (version 210) used on Apollo 15, 16, and 17, consisted of around 36,000 lines of code (a mix of assembly language and interpretive code which implemented high-level operations), of which Don Eyles wrote in excess of 2,200 lines, responsible for the lunar landing from the start of braking from lunar orbit through touchdown on the Moon. This was by far the most dynamic phase of an Apollo mission, and the most demanding on the limited resources of the AGC, which was pushed to around 90% of its capacity during the final landing phase where the astronauts were selecting the landing spot and guiding the Lunar Module toward a touchdown. The margin was razor-thin, and that's assuming everything went as planned. But this was not always the case.

It was when the unexpected happened that the genius of the AGC software and its ability to make the most of the severely limited resources at its disposal became apparent. As Apollo 11 approached the lunar surface, a series of five program alarms, with codes 1201 and 1202, interrupted the display of altitude and vertical velocity being monitored by Buzz Aldrin and read off to guide Neil Armstrong in flying to the landing spot. These codes both indicated out-of-memory conditions in the AGC's scarce read-write memory. The 1201 alarm was issued when a program requested a vector accumulator (VAC) area and all five of the 44-word areas were already in use, and 1202 signalled exhaustion of the eight 12-word core sets required by each running job. The computer had a single processor and could execute only one task at a time, but its operating system allowed lower priority tasks to be interrupted in order to service higher priority ones, such as the time-critical autopilot function and reading the inertial platform every two seconds. Each suspended lower-priority job used up a core set and, if it employed the interpretive mathematics library, a VAC, so exhaustion of these resources usually meant the computer was trying to do too many things at once. Task priorities were assigned so the most critical functions would be completed on time, but computer overload signalled something seriously wrong: a condition in which it was impossible to guarantee all essential work was getting done.
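The resource accounting described above can be sketched in a few lines. The pool sizes and alarm codes come from the text; the class and method names, the order in which the two pools are checked, and the use of an exception to stand in for a program alarm are all assumptions made for illustration, not a reconstruction of the AGC's Executive.

```python
class ProgramAlarm(Exception):
    """Stands in for an AGC program alarm; carries the alarm code."""
    def __init__(self, code):
        super().__init__(f"program alarm {code}")
        self.code = code

class Executive:
    CORE_SETS = 8  # per the text: eight 12-word core sets
    VAC_AREAS = 5  # per the text: five 44-word VAC areas

    def __init__(self):
        self.jobs = []  # each entry is (priority, name, uses_vac)

    def request(self, name, priority, uses_vac=False):
        """Admit a job, or raise the alarm for whichever pool is exhausted."""
        if len(self.jobs) >= self.CORE_SETS:
            raise ProgramAlarm(1202)  # no core set available
        if uses_vac and sum(job[2] for job in self.jobs) >= self.VAC_AREAS:
            raise ProgramAlarm(1201)  # no VAC area available
        self.jobs.append((priority, name, uses_vac))

    def run_next(self):
        """Remove and return the name of the highest-priority pending job."""
        job = max(self.jobs, key=lambda j: j[0])
        self.jobs.remove(job)
        return job[1]

ex = Executive()
for i in range(8):
    ex.request(f"job{i}", priority=i)
try:
    ex.request("one_too_many", priority=20)
except ProgramAlarm as alarm:
    code = alarm.code
print(code)  # prints 1202
```

Note that the ninth job is refused regardless of its priority: once the pool is empty there is nowhere to record it, which is why exhaustion signals overload rather than a mere scheduling delay.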

In this case, the computer would throw up its hands, issue a program alarm, and restart. But this couldn't be a lengthy reboot like customers of personal computers with millions of times the AGC's capacity tolerate half a century later. The critical tasks in the AGC's software incorporated restart protection, in which they would frequently checkpoint their current state, permitting them to resume almost instantaneously after a restart. Programmers estimated around 4% of the AGC's program memory was devoted to restart protection, and some questioned its worth. On Apollo 11, it would save the landing mission.
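Restart protection is, at heart, checkpointing. A minimal sketch, assuming each step is safe to repeat and that a dictionary can stand in for the non-volatile read-write memory (the names here are hypothetical, and the real mechanism was considerably more elaborate):

```python
def run_with_restarts(steps, erasable, fail_at=None):
    """Run `steps` in order, recording progress in `erasable`.

    steps: list of zero-argument callables, each assumed safe to repeat.
    erasable: dict standing in for memory that survives a restart.
    fail_at: optional step index at which to simulate a restart.
    """
    start = erasable.get("phase", 0)       # resume from the last checkpoint
    for i in range(start, len(steps)):
        if i == fail_at:
            raise RuntimeError("restart")  # simulated software restart
        steps[i]()
        erasable["phase"] = i + 1          # checkpoint after each completed step

log = []
steps = [lambda n=n: log.append(n) for n in range(4)]
memory = {}
try:
    run_with_restarts(steps, memory, fail_at=2)
except RuntimeError:
    pass                                   # the "restart"
run_with_restarts(steps, memory)           # picks up again at step 2
print(log)  # prints [0, 1, 2, 3]
```

Work completed before the restart is not repeated, and nothing after it is lost, which is the property that mattered during the landing.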

Shortly after the Lunar Module's landing radar locked onto the lunar surface, Aldrin keyed in the code to monitor its readings and immediately received a 1202 alarm: no core sets to run a task; the AGC restarted. On the communications link Armstrong called out “It's a 1202.” and Aldrin confirmed “1202.”. This was followed by fifteen seconds of silence on the “air to ground” loop, after which Armstrong broke in with “Give us a reading on the 1202 Program alarm.” At this point, neither the astronauts nor the support team in Houston had any idea what a 1202 alarm was or what it might mean for the mission. But the nefarious simulation supervisors had cranked in such “impossible” alarms in earlier training sessions, and controllers had developed a rule that if an alarm was infrequent and the Lunar Module appeared to be flying normally, it was not a reason to abort the descent.

At the Instrumentation Laboratory in Cambridge, Massachusetts, Don Eyles and his colleagues knew precisely what a 1202 was and found it deeply disturbing. The AGC software had been carefully designed to maintain a 10% safety margin under the worst case conditions of a lunar landing, and 1202 alarms had never occurred in any of their thousands of simulator runs using the same AGC hardware, software, and sensors as Apollo 11's Lunar Module. Don Eyles' analysis, in real time, just after a second 1202 alarm occurred thirty seconds later, was:

Again our computations have been flushed and the LM is still flying. In Cambridge someone says, “Something is stealing time.” … Some dreadful thing is active in our computer and we do not know what it is or what it will do next. Unlike Garman [AGC support engineer for Mission Control] in Houston I know too much. If it were in my hands, I would call an abort.

As the Lunar Module passed 3000 feet, another alarm, this time a 1201—VAC areas exhausted—flashed. This is another indication of overload, but of a different kind. Mission control immediately calls up “We're go. Same type. We're go.” Well, it wasn't the same type, but they decided to press on. Descending through 2000 feet, the DSKY (computer display and keyboard) goes blank and stays blank for ten agonising seconds. Seventeen seconds later another 1202 alarm, and a blank display for two seconds—Armstrong's heart rate reaches 150. A total of five program alarms and resets had occurred in the final minutes of landing. But why? And could the computer be trusted to fly the return from the Moon's surface to rendezvous with the Command Module?

While the Lunar Module was still on the lunar surface, Instrumentation Laboratory engineer George Silver figured out what happened. During the landing, the Lunar Module's rendezvous radar (used only during return to the Command Module) was powered on and set to a position where its reference timing signal came from an internal clock rather than the AGC's master timing reference. If these clocks were in a worst case out of phase condition, the rendezvous radar would flood the AGC with what we used to call “nonsense interrupts” back in the day, at a rate of 12,800 per second, each consuming one 11.72 microsecond memory cycle. This imposed an additional load of more than 13% on the AGC, which pushed it over the edge and caused tasks deemed non-critical (such as updating the DSKY) not to be completed on time, resulting in the program alarms and restarts. The fix was simple: don't enable the rendezvous radar until you need it, and when you do, put the switch in the position that synchronises it with the AGC's clock. But the AGC had proved its excellence as a real-time system: in the face of unexpected and unknown external perturbations it had completed the mission flawlessly, while alerting its developers to a problem which required their attention.
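The load imposed by stolen memory cycles is easy to check. The sketch below assumes, as described above, that each spurious interrupt costs exactly one 11.72 microsecond memory cycle; the function names are invented, and the sample inputs are values to try rather than measurements.

```python
MEMORY_CYCLE_US = 11.72  # microseconds per memory cycle, from the text

def load_fraction(interrupts_per_second: float) -> float:
    """Fraction of processor time lost if each interrupt costs one memory cycle."""
    return interrupts_per_second * MEMORY_CYCLE_US * 1e-6

def rate_for_load(fraction: float) -> float:
    """Interrupt rate (per second) that would consume the given fraction of time."""
    return fraction / (MEMORY_CYCLE_US * 1e-6)

print(f"{load_fraction(12_800):.1%}")  # prints 15.0%
print(round(rate_for_load(0.13)))      # prints 11092
```

In other words, a load above 13% requires upward of eleven thousand stolen cycles every second, which is more than enough to consume a 10% safety margin.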

The creativity of the AGC software developers and the merit of computer systems sufficiently simple that the small number of people who designed them completely understood every aspect of their operation were demonstrated on Apollo 14. As the Lunar Module was checked out prior to the landing, the astronauts in the spacecraft and Mission Control saw the abort signal come on, which was supposed to indicate the big Abort button on the control panel had been pushed. This button, if pressed during descent to the lunar surface, immediately aborted the landing attempt and initiated a return to lunar orbit. This was a “one and done” operation: no Microsoft-style “Do you really mean it?” tea ceremony before ending the mission. Tapping the switch made the signal come and go, and it was concluded the most likely cause was a piece of metal contamination floating around inside the switch and occasionally shorting the contacts. The abort signal caused no problems during lunar orbit, but if it should happen during descent, perhaps jostled by vibration from the descent engine, it would be disastrous: wrecking a mission costing hundreds of millions of dollars and, coming on the heels of Apollo 13's mission failure and narrow escape from disaster, possibly bringing an end to the Apollo lunar landing programme.

The Lunar Module AGC team, with Don Eyles as the lead, was faced with an immediate challenge: was there a way to patch the software to ignore the abort switch, protecting the landing, while still allowing an abort to be commanded, if necessary, from the computer keyboard (DSKY)? The answer to this was obvious and immediately apparent: no. The landing software, like all AGC programs, ran from read-only rope memory which had been woven on the ground months before the mission and could not be changed in flight. But perhaps there was another way. Eyles and his colleagues dug into the program listing, traced the path through the logic, and cobbled together a procedure, then tested it in the simulator at the Instrumentation Laboratory. While the AGC's programming was fixed, the AGC operating system provided low-level commands which allowed the crew to examine and change bits in locations in the read-write memory. Eyles discovered that by setting the bit which indicated that an abort was already in progress, the abort switch would be ignored at the critical moments during the descent. As with all software hacks, this had other consequences requiring their own work-arounds, but by the time Apollo 14's Lunar Module emerged from behind the Moon on course for its landing, a complete procedure had been developed which was radioed up from Houston and worked perfectly, resulting in a flawless landing.
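The essence of the work-around is a single bit in read-write memory changing how the program treats an input. The sketch below illustrates only that idea: the bit position, the names, and the logic are hypothetical, and the actual flag words, addresses, and keystroke procedure used on Apollo 14 are not given here.

```python
WORD_MASK = 0o77777         # 15 programmer-visible bits per AGC word
ABORT_IN_PROGRESS = 1 << 3  # hypothetical bit position, for illustration only

def set_bit(word: int, mask: int) -> int:
    """Return `word` with the bits in `mask` set, truncated to 15 bits."""
    return (word | mask) & WORD_MASK

def abort_switch_honoured(flag_word: int, switch_closed: bool) -> bool:
    """Act on the switch only if no abort is already marked as in progress."""
    return switch_closed and not (flag_word & ABORT_IN_PROGRESS)

flags = 0o00000
before = abort_switch_honoured(flags, switch_closed=True)  # spurious signal acted on
flags = set_bit(flags, ABORT_IN_PROGRESS)                  # crew keys in the change
after = abort_switch_honoured(flags, switch_closed=True)   # spurious signal ignored
print(before, after)  # prints True False
```

The program in rope memory is untouched; only the data it consults has changed, which is why such a change could be made from the keyboard in flight.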

These and many other stories of the development and flight experience of the AGC lunar landing software are related here by the person who wrote most of it and supported every lunar landing mission as it happened. Where technical detail is required to understand what is happening, no punches are pulled, even to the level of bit-twiddling and hideously clever programming tricks such as using an overflow condition to skip over an EXTEND instruction, converting the following instruction from double precision to single precision, all in order to save around forty words of precious non-bank-switched memory. In addition, this is a personal story, set in the context of the turbulent 1960s and early ’70s, of the author and other young people accomplishing things no humans had ever before attempted.

It was a time when everybody was making it up as they went along, learning from experience, and improvising on the fly; a time when a person who had never written a line of computer code would write, as his first program, the code that would land men on the Moon, and when the creativity and hard work of individuals made all the difference. Already, by the end of the Apollo project, the curtain was ringing down on this era. Even though a number of improvements had been developed for the LM AGC software which improved precision landing capability, reduced the workload on the astronauts, and increased robustness, none of these were incorporated in the software for the final three Apollo missions, LUMINARY 210, which was deemed “good enough” and the benefit of the changes not worth the risk and effort to test and incorporate them. Programmers seeking this kind of adventure today will not find it at NASA or its contractors, but instead in the innovative “New Space” and smallsat industries.


Howe, Steven D. Wrench and Claw. Seattle: Amazon Digital Services, 2011. ASIN B005JPZ74A.
In the conclusion of the author's Honor Bound Honor Born (May 2014), an explorer on the Moon discovers something that just shouldn't be there, which calls into question the history of the Earth and Moon and humanity's place in it. This short novel (or novella—it's 81 pages in a print edition) explores how that anomaly came to be and presents a brilliantly sketched alternative history which reminds the reader just how little we really know about the vast expanses of time which preceded our own species' appearance on the cosmic stage.

Vesquith is an Army lieutenant assigned to a base on the Moon. The base is devoted to research, exploration, and development of lunar resources to expand the presence on the Moon, but more recently has become a key asset in Earth's defence, as its Lunar Observation Post (LOP) allows monitoring the inner solar system. This has become crucial since the Martian colony, founded with high hopes, has come under the domination of self-proclaimed “King” Rornak, whose religious fanatics infiltrated the settlement and now threaten the Earth with an arsenal of nuclear weapons they have somehow obtained and are using to divert asteroids to exploit their resources for the development of Mars.

Independently, Bob, a field paleontologist whose expedition is running short of funds, is enduring a fundraising lecture at a Denver museum by a Dr Dietlief, a crowd-pleasing science populariser who regales his audiences with illustrations of how little we really know about the Earth's past, stretching for vast expanses of time compared to that since the emergence of modern humans, and wild speculations about what might have come and gone during those aeons, including the rise and fall of advanced technological civilisations whose works may have disappeared without a trace in a million years or so after their demise due to corrosion, erosion, and the incessant shifting of the continents and recycling of the Earth's surface. How do we know that, somewhere beneath our feet, yet to be discovered by paleontologists who probably wouldn't understand what they'd found, lies “something like a crescent wrench clutched in a claw?” Dietlief suggests that even if paleontologists came across what remained of such evidence after dozens of millions of years they'd probably not recognise it because they weren't looking for such a thing and didn't have the specialised equipment needed to detect it.

On the Moon, Vesquith and his crew return to base to find it has been attacked, presumably by an advance party from Mars, wiping out a detachment of Amphibious Marines sent to guard the LOP and disabling it, rendering Earth blind to attack from Mars. The survivors must improvise with the few resources remaining from the attack to meet their needs, try to restore communications with Earth to warn of a possible attack and request a rescue mission, and defend against possible additional assaults on their base. This is put to the test when another contingent of invaders arrives to put the base permanently out of commission and open the way for a general attack on Earth.

Bob, meanwhile, thanks to funds raised by Dr Dietlief's lecture, has been able to extend his fieldwork, add some assistants, and equip his on-site lab with some new analytic equipment….

This is a brilliant story which rewrites the history of the Earth and sets the stage for the second volume in the Earth Rise series, Honor Bound Honor Born. There is so much going on and so many surprises that I can't really say much more without venturing into spoiler territory, so I won't. The only shortcoming is that, like many self-published works, it stumbles over the humble apostrophe, and particularly its shock troops, the “its/it's” brigade.

During the author's twenty-year career at the Los Alamos National Laboratory, he worked on a variety of technologies including nuclear propulsion and applications of nuclear power to space exploration and development. Since the 1980s he has been an advocate of a “power rich” approach to space missions, in particular lunar and Mars bases. The lunar base described in the story implements this strategy, but it's not central to the story and doesn't intrude upon the adventure.

This book is presently available only in a Kindle edition, which is free for Kindle Unlimited subscribers.


Smyth, Henry D. Atomic Energy for Military Purposes. Stanford, CA: Stanford University Press, [1945] 1990. ISBN 978-0-8047-1722-9.
This document was released to the general public by the United States War Department on August 12th, 1945, just days after nuclear weapons had been dropped on Japan (Hiroshima on August 6th and Nagasaki on August 9th). The author, Prof. Henry D. Smyth of Princeton University, had worked on the Manhattan Project since early 1941, was involved in a variety of theoretical and practical aspects of the effort, and possessed security clearances which gave him access to all of the laboratories and production facilities involved in the project. In May, 1944, Smyth, who had suggested such a publication, was given the go ahead by the Manhattan Project's Military Policy Committee to prepare an unclassified summary of the bomb project. This would have a dual purpose: to disclose to citizens and taxpayers what had been done on their behalf, and to provide scientists and engineers involved in the project a guide to what they could discuss openly in the postwar period: if it was in the “Smyth Report” (as it came to be called), it was public information, otherwise mum's the word.

The report is both an introduction to the physics underlying nuclear fission and its use in both steady-state reactors and explosives, production of fissile material (both separation of reactive Uranium-235 from the much more abundant Uranium-238 and production of Plutonium-239 in nuclear reactors), and the administrative history and structure of the project. Viewed as a historical document, the report is as interesting in what it left out as what was disclosed. Essentially none of the key details discovered and developed by the Manhattan Project which might be of use to aspiring bomb makers appear here. The key pieces of information which were not known to interested physicists in 1940 before the curtain of secrecy descended upon anything related to nuclear fission were inherently disclosed by the very fact that a fission bomb had been built, detonated, and produced a very large explosive yield.

  • It was possible to achieve a fast fission reaction with substantial explosive yield.
  • It was possible to prepare a sufficient quantity of fissile material (uranium or plutonium) to build a bomb.
  • The critical mass required by a bomb was within the range which could be produced by a country with the industrial resources of the United States and small enough that it could be delivered by an aircraft.

None of these were known at the outset of the Manhattan Project (which is why it was such a gamble to undertake it), but after the first bombs were used, they were apparent to anybody who was interested, most definitely including the Soviet Union (who, unbeknownst to Smyth and the political and military leaders of the Manhattan Project, already had the blueprints for the Trinity bomb and extensive information on all aspects of the project from their spies).

Things never disclosed in the Smyth Report include the critical masses of uranium and plutonium, the problem of contamination of reactor-produced plutonium with the Plutonium-240 isotope and the consequent impossibility of using a gun-type design with plutonium, the technique of implosion and the technologies required to achieve it such as explosive lenses and pulsed power detonators (indeed, the word “implosion” appears nowhere in the document), and the chemical processes used to separate plutonium from uranium and fission products irradiated in a production reactor. In many places, it is explicitly said that military security prevents discussion of aspects of the project, but in others nasty surprises which tremendously complicated the effort are simply not mentioned—left for others wishing to follow in its path to discover for themselves.

Reading the first part of the report, you get the sense that it had not yet been decided whether to disclose the existence or scale of the Los Alamos operation. Only toward the end of the work is Los Alamos named and the facilities and tasks undertaken there described. The bulk of the report was clearly written before the Trinity test of the plutonium bomb on July 16, 1945. It is described in an appendix which reproduces verbatim the War Department press release describing the test, which was only issued after the bombs were used on Japan.

This document is of historical interest only. If you're interested in the history of the Manhattan Project and the design of the first fission bombs, more recent works such as Richard Rhodes' The Making of the Atomic Bomb are much better sources. For those aware of the scope and details of the wartime bomb project, the Smyth Report is an interesting look at what those responsible for it felt comfortable disclosing and what they wished to continue to keep secret. The foreword by General Leslie R. Groves reminds readers that “Persons disclosing or securing additional information by any means whatsoever without authorization are subject to severe penalties under the Espionage Act.”

I read a Kindle edition from another publisher which is much less expensive than the Stanford paperback but contains a substantial number of typographical errors probably introduced by scanning a paper source document with inadequate subsequent copy editing.
