bobeager.uk Anecdotes

This is a collection of (sometimes) mildly amusing anecdotes. You may already have tired of hearing these from me, or they may be new to you. They are all true. Some have technical content, some don't. This page will grow over time as my memory manages to retrieve archived information.

Technical stuff

These have rather more technical content than the ones in the following section.

The phantom job

In my first year as an undergraduate, we were taught BASIC - it was the only language available online, as opposed to in 'batch' mode via punched cards and printer (I ended up teaching that course myself some years later).

After a few weeks of BASIC, I decided to learn assembly language for the mainframe - an ICL 4130, which was a 24 bit word oriented machine with 96kW of memory. In practice, the best type of target program was a 'subsystem' for KOS, the online facility. Users had only 1536 words of working memory (excluding the code itself), and about 600 of those were used for the I/O system and other services. Nevertheless, I wrote a simple linear regression program (mirroring the one we had done in BASIC for an assessment).

Running a subsystem was something for which one had to get permission, as a fault could stop the online system (although a simple operator command would continue it from the stop point). Once I had proved myself, I was allowed to write more programs.

However, I (and others) became increasingly ambitious, and did some naughty things. Fairly early on, I 'acquired' a copy of the (assembler) source code for the online system, KOS. My first venture into the 'naughty' area was to write a program that would simulate the effect of Ctrl-C (well, its equivalent in those days) on a specified terminal. I was able to do this by modifying a status bit in the terminal multiplexer device driver (I had previously found a way to write anywhere in the entire machine's memory, subverting the memory management). Later, I discovered that I could remotely log someone out as well.

Various other programs followed, and a few of us decided it would be a good idea to be able to submit batch work to the batch queue via a terminal, instead of on punched cards. One could then use other languages such as Algol and FORTRAN, which were not available in the on-line system.

This wasn't too difficult:

  1. Load a program that inserted itself between the executive card reader driver and the batch system.
  2. Fulfil requests for 'next card' by getting one from the card reader driver and passing it on - until an end of job card was detected.
  3. Pass on the end of job card, but follow it with card images taken from a previously prepared file.
  4. At the end of that file, reconnect the batch system to the card reader driver and exit the program.
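
For the curious, the four steps above can be modelled in a few lines of modern Python. This is only a sketch, not the original code: the END_OF_JOB marker, the function name and the card contents are all invented for illustration, since the real card format isn't recorded here.

```python
from typing import Callable, Iterable, Iterator

END_OF_JOB = "END OF JOB"  # hypothetical marker; the real end-of-job card is not described


def interposed_reader(real_next_card: Callable[[], str],
                      prepared_file: Iterable[str]) -> Iterator[str]:
    """Yield cards to the batch system, splicing in a prepared file after one real job."""
    # Steps 1-2: sit between the driver and the batch system, passing real cards through.
    while True:
        card = real_next_card()
        yield card  # the end-of-job card itself is passed on as well (step 3)
        if card == END_OF_JOB:
            break
    # Step 3: follow the end-of-job card with card images from the prepared file.
    for image in prepared_file:
        yield image
    # Step 4: the generator finishes; the batch system goes back to the real driver.


deck = iter(["JOB A", "DATA", END_OF_JOB, "JOB B"])
cards = list(interposed_reader(lambda: next(deck), ["JOB X", "PRINT", END_OF_JOB]))
print(cards)
```

The batch system sees one real job, then the 'phantom' job, and the next real card ("JOB B") is still waiting in the reader afterwards.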

It worked perfectly. Except ... when the computer operators came to collect up all the punched cards for the completed jobs, and reconcile them with their associated printouts ... there were no cards for one lot of printout! They spent some time looking for them, and to avoid further suspicion, action was taken. Cards were quickly typed for the 'phantom' job, and dropped down the back of the table that held the trays of returned cards. Then: "Oh look, what's that down there?".

We got away with that one. Over time we encountered restrictions on running subsystems, and had to write a loader. That is another story.

Hacking the CPU

This isn't really an amusing one, but people seem interested, so here it is.

I studied Electronics as an undergraduate. The degree programme wasn't quite what I expected (it leaned more towards pure electronics than electronic engineering), and I found it had rather too much physics in it, and not enough emphasis on the practical stuff; I think the problem was the inbuilt assumption that I'd have been playing with stuff (and designing it) for quite a while before starting the degree.

Two things happened. One was that I soon became interested in all things related to computers, and the second was that I found I could (sort of) do digital electronics OK (much easier!). I learned a lot of stuff about various systems, in particular our mainframe (see previous item), which ran VMs-within-VMs, effectively.

We had to attend a compulsory Long Vacation course at the end of our second year, doing something useful at university for three weeks over the summer. I was involved in designing and building a plotter interface, which was a spectacular disaster but did teach me what not to do.

My final year project was unusual in that it was a joint one with another student; he ended up doing the actual building part (including PCB design), and I did the logic design and the software (hardware tests). The project was to modify the CPU of a Honeywell DDP-516 minicomputer, which was fitted with something called a memory lockout option (MLO).

I had realised that the MLO provided insufficient facilities and controls to provide a proper virtual machine (as a hardware emulation). It did provide something called restricted mode, which stopped certain instructions (e.g. CPU halt) from being executed. The project was intended to fix this. It was made easier on a practical level by the fact that this was a set of wire wrapped backplanes containing multiple, rather small, circuit boards. There were some spare, uncommitted board slots, which were wired and used for this.

This is not the place to go into a lot of detail, but essentially the modifications were:

  • In restricted mode, some instructions were silently treated as a no-op (NOP) instruction, rather than causing an interrupt to the operating system. This meant that instruction emulation inside a virtual machine was impossible. This was fixed by the addition of extra logic to cause an interrupt in this situation.
  • It was also not possible to emulate restricted mode inside a virtual machine, because the instruction to enter restricted mode (ERM) was treated as a no-op if already in restricted mode. This was fixed by the addition of logic to force an interrupt.
  • This wasn't strictly necessary but made life easier. When an interrupt occurred, it wasn't possible to tell if the interrupt had come from restricted mode, or from normal mode. A latch was added to preserve the previous mode. There was an instruction to save machine flags, etc. on an interrupt; it was, strangely, called 'input keys' (INK). It simply copied the flags to the accumulator so that the operating system could store them somewhere. INK was modified to save the state of the 'previous mode' latch in an unused bit. Note that the 'output keys' instruction (OTK) was naturally not modified, as all it could do would be to write back into the latch!
  • I am pretty sure there was other stuff, too, but I can't remember right now.
  • And of course I wrote a set of hardware test programs.
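
As a rough illustration of the modifications listed above, here is a toy model in Python. It is a sketch under several assumptions: the bit position of the 'previous mode' flag, the HLT mnemonic, the set of restricted instructions and the idea that an interrupt always returns to normal mode are all chosen for the example, not taken from the hardware documentation.

```python
PREV_MODE_BIT = 0x0001           # hypothetical position of the 'unused bit' in the keys
RESTRICTED_OPS = {"HLT", "ERM"}  # assumed set; only CPU halt and ERM are mentioned above


class Cpu:
    def __init__(self):
        self.restricted = False
        self.prev_restricted = False  # the added 'previous mode' latch
        self.keys = 0                 # other machine flags, not modelled here
        self.accumulator = 0
        self.trapped = []             # instructions that caused an interrupt

    def interrupt(self, cause):
        self.prev_restricted = self.restricted  # remember the mode being left
        self.restricted = False                 # assumption: the OS runs in normal mode
        self.trapped.append(cause)

    def execute(self, op):
        if self.restricted and op in RESTRICTED_OPS:
            self.interrupt(op)        # modified behaviour: trap, not a silent no-op
        elif op == "ERM":
            self.restricted = True
        elif op == "INK":
            # Modified INK: copy the keys plus the previous-mode latch to the accumulator.
            self.accumulator = self.keys | (PREV_MODE_BIT if self.prev_restricted else 0)


cpu = Cpu()
cpu.execute("ERM")  # normal mode: enters restricted mode
cpu.execute("ERM")  # restricted mode: now traps instead of doing nothing
cpu.execute("INK")  # the operating system reads the keys, including the new bit
print(cpu.trapped, cpu.accumulator == PREV_MODE_BIT)
```

With the trap in place, an operating system can see the second ERM and emulate restricted mode for a nested virtual machine, which is what the silent no-op made impossible.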

Crashing the system by deleting a file

This was another escapade on the ICL 4130 running KOS. By this time I was a (supposedly responsible) postgraduate.

The 4130 was running out of disk space, but it was nearing the end of its 10 year funded life. It had four 2MB disks, but needed more; however, it was not cost effective to buy more disks, and, I believe, an extra disk controller.

Two members of staff (one being Brian Spratt, the Director of the Computing Laboratory) thought up a cunning plan for cheap disk space. This used a PDP-11 as a kind of file server. One of the staff built a hardware interface between the 4130 and the PDP-11, and the other wrote the link software. There was also a little extra software in the PDP-11.

The basic idea was that the PDP-11 appeared as an extra disk - the current disks had single digit numbers, and the disk on the PDP-11 became disk 99. The PDP-11 ran its manufacturer's operating system - a pretty basic one called DOS/BATCH. Filenames on the 4130 were a maximum of eight characters, whereas they were nine characters on the PDP-11. One could thus directly map filenames from 36 users on the 4130 to a single user on the PDP-11, by using the extra letter or digit to differentiate files for different users (there were a limited number of user accounts in DOS/BATCH).
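
The filename mapping might have looked something like the following Python sketch. The details are assumptions: where the extra character went (here, at the front), the order of the 36 tags, and the function names are all hypothetical.

```python
import string
from typing import Tuple

# 26 letters + 10 digits = 36 distinguishable users; the ordering is an assumption.
TAGS = string.ascii_uppercase + string.digits


def to_pdp11_name(user_index: int, name_4130: str) -> str:
    """Map a 4130 filename (up to eight characters) to a PDP-11 name of up to nine."""
    if not 0 <= user_index < len(TAGS):
        raise ValueError("one extra character can only distinguish 36 users")
    if not 1 <= len(name_4130) <= 8:
        raise ValueError("4130 filenames are at most eight characters")
    return TAGS[user_index] + name_4130  # assumed: the tag is the first character


def from_pdp11_name(name_pdp11: str) -> Tuple[int, str]:
    """Recover the user index and the original 4130 filename."""
    return TAGS.index(name_pdp11[0]), name_pdp11[1:]


print(to_pdp11_name(0, "REPORT"), from_pdp11_name("9ABCDEFGH"))
```

Because the mapping is reversible, the 4130 side never needs to keep a separate table of which PDP-11 file belongs to whom.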

This all worked surprisingly well. The disk on the PDP-11 was an RP02, which was 20 megabytes; this was a vast improvement. A second disk was added later.

Until I came along! One day, I had written a program to do something pretty innocuous; I forget what it was. I accidentally got it into a loop writing to a file, and managed to fill up the rest of the 20 megabytes. I realised what I'd done, so I simply deleted the rather large output file. This would have been OK, but ...

DOS/BATCH used the system of 'block chaining' to construct files in its filing system. Essentially, files were linked lists of blocks, with a bitmap or a free list recording free blocks. When I deleted my large file, its deletion involved laboriously crawling down the very long chain of blocks in the file, returning each one and marking it as free. This took a long time - so long, in fact, that the 4130 thought the PDP-11 had crashed. The software was very simple - it did the easiest thing - it halted the machine.
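
To see why the deletion was so slow, here is a small Python model of a chained file. It is a sketch only: the END marker, the list of links and the set standing in for the bitmap or free list are illustrative choices, not the DOS/BATCH on-disk format.

```python
END = -1  # hypothetical end-of-chain marker


def delete_file(next_block: list, first_block: int, free: set) -> int:
    """Free every block in a chained file; return the number of block reads needed."""
    reads = 0
    block = first_block
    while block != END:
        following = next_block[block]  # the link is stored in the block: one read each
        reads += 1
        free.add(block)                # mark the block as free
        next_block[block] = END
        block = following
    return reads


# A five-block disk holding one file chained 0 -> 3 -> 1 -> 4; block 2 is already free.
links = [3, 4, END, 1, END]
free_blocks = {2}
reads = delete_file(links, 0, free_blocks)
print(reads, sorted(free_blocks))
```

The number of reads grows in step with the length of the file, so a file filling most of a 20 megabyte disk means an enormous number of separate disk accesses before the delete completes.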

So I brought the University mainframe to a standstill by deleting a file.

Crashing the system by editing a file

This happened again on the ICL 4130 running KOS.

I was doing research on software portability, in which I had been interested for some time. I had obtained a portable editor from a postgraduate at the University of Essex, and had implemented it on KOS. It worked in very limited memory (essential on KOS) but had advanced looping and decision constructs which made it very powerful. A select group of people (including me, of course) used it a lot.

KOS was simply a layer on top of the manufacturer's operating system; as such, it had to deal with unexpected error returns from the system. In general, these did not happen very much at all. For development purposes (and KOS was being developed continually), any unexpected error would cause KOS to stop scheduling its timeshared users, print the message LOGICAL ERROR on the operator's console, and pause for operator input. A simple command would allow it to continue, but of course the error had to be investigated first.

My portable editor just occasionally caused a logical error. I tried my best but could never find the fault. Then, one morning, I managed to cause four logical errors. The system manager wasn't happy, and he printed an octal dump of the entire KOS slave (we would call it a virtual machine these days). This was on 11 inch by 8 inch paper, quite thin, and a pile about a foot thick. He dumped it on my desk, with the order "Fix it!"

I took the pile back to my Darwin study bedroom, and left it on the floor for several days. On the Saturday evening, I and several other postgrads gathered in Darwin Bar, and I had quite a lot to drink. At closing time, I staggered back to my room, not at all sleepy. I assume I said to myself, no doubt in a slurred voice: "Ah, fix the editor!"

Apparently, I did so. I have no more recollection of that night, but I woke up the next morning to find paper all over the floor. On the top sheet was written "Uninitialised variable in fourth word of VFILE control block". And so it was.

Fooling the managers

The University of Kent's ICL 2960 was reasonably reliable - rather more so after we retired VME/K (the manufacturer-supplied operating system) in favour of EMAS, an operating system from the University of Edinburgh. The main points of hardware failure seemed to be fans and power supply units.

When the machine did break down, it caused great disruption to classes, and these were not easily rescheduled. We were thus under great pressure to get back in operation as soon as possible. We had invaluable assistance in this from our site engineers, in particular a lovely man called Harry Sweet, who lived in Herne Bay.

One of the most frustrating things was the time it took a spare PSU to reach us, even with two engineers doing a halfway meet (the trials of being in deepest East Kent). So we had a cunning plan; we kept unofficial on-site spares - unknown to the engineers' manager, who, by judicious use of smoke and mirrors, was persuaded (unwittingly) to provide several spare units, at least one of each kind.

The problem was where to store them. They had to be accessible to the engineers, but could not be kept in the room provided for them, because their manager might have noticed. Instead, they were stored under the false floor in the machine room, scattered in various empty spaces.

Of course, there was then the problem of finding the right unit without lifting half the floor. This was solved by the production of a 'treasure map', the grid corresponding to the floor tile layout. The map was taped to the back of a drawer in the engineers' room...well away from management eyes.

We still found a couple of mislaid PSUs when the machine was decommissioned.

The engineer doesn't always know best

When the University of Kent's ICL 2960 mainframe was installed, it came with a site engineer. For quite a while, one of these was someone who was a Kent graduate. He was somewhat of a 'company man', and was not keen when we abandoned VME/K in favour of EMAS (from the University of Edinburgh).

I managed EMAS; it had a novel way of handling filestore and (for the purposes of this story) peripherals such as printers. These were managed via a Spooler process, which handled all of the exception conditions, farmed out to it by the actual supervisor. Whoever wrote the code at Edinburgh had been a little obsessive about detailed error messages - a good thing, and possible because all of the messages were inside a paged process.

One day, we saw a message we had never seen before. I forget the exact text, but it indicated that a particular fuse had blown in the printer. We duly called the engineer from his room. He looked at the message, and shook his head, stating that no such fuse existed and "our" system was wrong.

We pressed him on this, and after casting his eye over the defunct printer he retired to his office and manuals. He returned a few minutes later, bearing a fuse. He silently opened a small panel in the printer casing, and changed the fuse.

Hacking the hardware

Once the University of Kent had moved to using EMAS, it was enjoying a rolling MTBF averaging about 2000 hours over a 13 week period. This was much better than the 20 hours we had been getting from VME/K. People were very happy.

And then one day it all began to fall apart. The machine just stopped. No crash, nothing. The engineer's panel indicated that the microcode had halted. We re-IPLed the system, and an hour or two later it stopped again. Eventually we called the engineers, and they ran tests. Lots of them. They pronounced that there was nothing wrong.

Then the 'crashes' stopped, for a couple of weeks. Then they started again. We couldn't get a handle on what was wrong at all. It was eventually decided that, the next time it happened, I should use the engineer's panel, for as long as it took, to investigate the state of the machine. In the event, I simply dumped out all the target machine registers, and the microcode PC.

Our engineers obligingly left a microcode training manual lying around, together with a microfiche listing of the microcode. Oh, and some circuit diagrams. I retired to a darkened room for much of that day; and the next. Eventually I emerged with the reason for the crashes. Without going into too much technical detail, it seemed that the microcode and the hardware handed off tasks to each other; in particular, a part of the hardware called the 'scheduler' was responsible for validating the type field in the descriptor register during the execution of any instruction that used a descriptor to access an operand. Any invalid type was trapped, and sent back to the microcode to force an exception (known as a 'contingency'). All other type values were considered valid, and passed back to the microcode to be used in accessing a jump table, thence invoking the right bit of microcode for that descriptor type.

So, what was going wrong? It turned out that there was what can only be described as a hardware design error. The scheduler didn't detect one particular invalid type code, so it handed it back to the microcode, which accessed the jump table with it. This of course accessed an entry marked 'can never happen', and the microcode halted. We later discovered that a physicist's errant FORTRAN program was overwriting a descriptor, and generating the bad type value. If the machine stopped, he just submitted the job again until he got fed up and went off for a week or two. Then he tried again, never noticing the causal connection.

We contacted ICL, but we never seemed to reach anyone who either understood what the problem was, or had the power or inclination to get it fixed (which would not have been a quick job, in any case).

So I decided I had better fix this another way. Back to the microcode listing. I found an empty patch area, and hand assembled a new bit of microcode which I linked to the right jump table entry. All this did was generate a 'descriptor error' contingency with a hitherto unused subtype code. I then wrote a tool to extract the microcode from the system disk, patch it, and put it back again. We IPLed the system, and tested it (by this time I had a test program). Success - it correctly triggered the new contingency and the microcode didn't halt!
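
The fault and the patch can be modelled in Python along the following lines. This is a sketch under invented assumptions: the 3-bit type field, the particular codes treated as valid or invalid, and all of the names are made up for the example, and bear no relation to the real 2960 descriptor format or microcode.

```python
class MicrocodeHalt(Exception):
    """Stands in for the microcode halting."""


def make_handler(n):
    return lambda type_code: f"operand accessed via handler {n}"


def can_never_happen(type_code):
    raise MicrocodeHalt(f"jump table entry {type_code} reached")


def patched_entry(type_code):
    # The patch: raise a contingency with a previously unused subtype instead of halting.
    return "contingency: descriptor error, new subtype"


# Hypothetical 3-bit type field: codes 0-4 are valid, 5-7 are invalid.
jump_table = [make_handler(n) for n in range(5)] + [can_never_happen] * 3
HARDWARE_REJECTS = {5, 6}  # the check ought to reject 7 as well, but misses it


def execute(type_code):
    if type_code in HARDWARE_REJECTS:        # validation done by the hardware
        return "contingency: invalid descriptor type"
    return jump_table[type_code](type_code)  # dispatch done by the microcode


try:
    execute(7)
    halted = False
except MicrocodeHalt:
    halted = True

jump_table[7] = patched_entry                # link the patch to the jump table entry
after_patch = execute(7)
print(halted, after_patch)
```

The patch leaves the hardware check alone and makes the 'can never happen' entry do something sensible, which is why it could be applied without any help from the manufacturer.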

The only thing left to do was to modify the various components of the operating system to do the right thing, culminating in a change to the FORTRAN run-time system to generate a suitable message. That only took me a few minutes.

We had no more microcode halts and the users were happy.

Dual? What dual?

The University of Kent's ICL 2960 was installed in 1976, and it moved to the EMAS operating system in 1979. EMAS was very efficient, but as the years went on the system was being stretched to its limits. By 1983 the system was fully committed pretty well 24/7. Government policy meant that we wouldn't get a replacement for another three years.

We knew we couldn't afford much of an upgrade, but we found out that there was a spare ICL 2960 OCP lying in a warehouse in Southall (I believe it had been used for the recent Census). It was free to a good home (us) but we had to pay about £350 for transport, etc. ICL kindly supplied the extra bits we needed to hook it up, and by slightly reducing the peripheral configuration (we no longer needed a card reader) we were pretty well able to cover maintenance costs within budget.

The day came, and we IPLed the dual system for the first time. EMAS said 'Dual OCP found' and went to work. Basically, it worked until anything went wrong, but it turned out that under exception conditions the operating system was unable properly to control the second OCP (e.g. to halt it). EMAS had never before been run on a dual 2960 OCP (there wasn't one at Edinburgh), and it turned out that the instructions and image store locations needed to communicate between OCPs were not standard across the 2900 range.

We asked ICL for documentation. No one knew where it could be found (we assume), or perhaps someone decided we shouldn't have it. In any case, we were stuck. Without documentation we couldn't modify the system supervisor to make dual OCPs work as they should.

I had previously learned quite a bit about the ICL 2960 microcode, so I retired once again to a darkened room with the microcode training manual, and a microfiche reader. It took me about a day before I emerged, having read a great deal of microcode and essentially reverse engineered all of the image store locations and bit positions needed to do what we needed; I think it was quite short, and here is most, if not all, of it. Armed with this, it was the work of minutes to modify the supervisor, rebuild it and re-IPL.

The system worked very well for its final three years.

Immortalising Bobby Tables

Many Computer Science students will know about Bobby Tables. He appears in this xkcd cartoon. If you don't understand it, then take a look at the explanation.

The University of Kent has a Footsteps Project, which aims to raise money by getting former students to donate chunks of money in return for having a brick laid in a memorial path. Bricks can have any wording required, as long as enough money is provided!

If one counts from the end of the path nearest the Gulbenkian Theatre, I think the Bobby Tables brick is in the sixth row, right hand side...

See also the piece about the misaligned path later.

Rather less technical stuff

These should be accessible to non computing people!

When I can remember them. And if the period specified by any statute of limitations has elapsed.

Other stuff

This is stuff that doesn't fit well into a category.

The misaligned path

This refers to the brick path built for the University of Kent's Footsteps Project. This aims to raise money by getting former students to donate chunks of money in return for having a brick laid in a memorial path. Bricks can have any wording required, as long as enough money is provided!

Although it doesn't say so any longer, the path is meant to follow the path of the Canterbury and Whitstable railway line, which ran in a tunnel beneath the University. That tunnel collapsed in 1974, causing consternation and inconvenience to many University members, not least the Computing Laboratory (the main computer had to be moved in a hurry). As a nod to the railway line motif, the brick path is bordered by steel rails.

However, the path doesn't follow the railway line - it is quite a few degrees out. Looking from the Gulbenkian Theatre, it should point a bit further to the right.

You can see this for yourself. View this map. It shows the railway line as a double dotted line (the tunnel) underneath what later became the site of the University. In the panel on the left, slowly drag the blue dot to the left, and a modern Google Maps overlay will fade in. You can see the line of the path - it's different!

Never mind. It's probably very sad that I noticed in the first place.

Right Said Fred

If you don't understand the title by the time you've finished this, look here.

At the University of Kent, the School of Computing occupies, inter alia, the first floor of the Cornwallis South building (and a bit of the ground floor). The rest of the ground floor is occupied by Information Services (who provide the central computing service). Prior to the early 1990s, these two entities were one and the same - the Computing Laboratory.

It will be noticed that the ground floor (east of the main entrance and lobby) consists of offices and corridors that surround a central, windowless set of rooms. The largest of these was, at one time, the main Computer Room (!) which contained the University's mainframe computer; it is now the main server room. There used to be windows so that visitors could be shown The Computer; these were on the south side, and if you go into the main entrance, turn right through the door, and walk a little way down the corridor, you can (if you squint, and in the right light) see where the windows were filled in. Originally, this central complex of rooms was completely rectangular.

In the mid 1970s, the University obtained funding for its ICL 2960 replacement mainframe computer. ICL were very good at site surveying (well, most of the time, but that's another story). The site surveyors came, and decreed that there were a number of problems. The most pressing one was building access; the system's OCP was contained in a long cabinet nearly a metre wide and several metres long. It would go in the front door. It could be rotated to go through the door on the right. It would not negotiate the slight dogleg necessary to go through the door and lobby on the corner of the Computer Room.

The solution was to modify the wall. Which is why, to this day, that corner of the server room is cut away a bit.

Glossary

It occurred to me that some of the items above might need to have explanations of one or two of the terms. So here they are.

EMAS
The Edinburgh Multi-Access System. EMAS was an operating system written at the University of Edinburgh, originally for the English Electric System 4 (a near clone of the IBM 360/370 series). It was ported to the ICL 2900 series, and later on to the newer IBM mainframes. For more details, see the Wikipedia article.
ICL 2960
The ICL 2960 was manufactured by International Computers Limited, the British computer company at the time (early 1970s). ICL was the result of serial mergers of a number of companies; for full details, see the Wikipedia article. The 2960 was part of ICL's 'New Range' of computers, intended to replace the ageing System 4 (inherited from English Electric), the 1900 series (inherited from International Computers and Tabulators), and the 4100 series (inherited from Elliott Automation). The machine had a 32 bit (4 byte) word, and some sophisticated facilities, with intelligent peripheral controllers.
ICL 4130
The ICL 4130 was manufactured by International Computers Limited, the British computer company at the time (early 1970s). ICL was the result of serial mergers of a number of companies; for full details, see the Wikipedia article. The 4130 was part of the 4100 series (the only other model being the more basic 4120). The 4130 had a 24 bit word, some interesting shared code features, and some simple base and range style memory management, together with the ability to operate in privileged and non-privileged modes.
Image store
The ICL 2900 architecture included instructions for accessing image store. These instructions were quite restricted (basically just 'read' and 'write'). The image store looked like a special kind of memory, but changes in its contents were used to modify the behaviour of the machine, and for accessing peripheral controllers. Image store instructions can be thought of as a mixture of access to machine control registers, and as input/output instructions (with image store addresses being analogous to input/output port numbers).
IPL
Initial Program Load. The act of loading a small program into a computer, which then loads a larger one, and eventually the operating system. More commonly known as bootstrapping or booting.
KOS
The Kent On-line System. KOS was a simple timesharing system allowing small applications, simple programming (mainly in BASIC), and editing of disk files. It ran alongside a conventional 'punched card and printer' batch system, on the ICL 4130 series of machines.
MTBF
Mean Time Between Failure.
OCP
Order Code Processor. Machines in the ICL 2900 series were built from a series of components; usually, each was in a separate cabinet (or several cabinets in many cases). The OCP was what is normally called a CPU - it was the component that executed instructions (the order code), and did not include peripheral controllers, memory, etc.
PDP-11
A range of 16 bit minicomputers manufactured by the Digital Equipment Corporation of Maynard, Massachusetts. The company name was usually abbreviated to DEC. They were relatively inexpensive (for the time), flexible and easy to interface.
PSU
Power Supply Unit. Power supply units usually convert mains power into a form suitable for powering computer components. Large computers may have several different ones, powering different components.
VME/K
Virtual Machine Environment K. VME/K was an operating system for smaller machines in the ICL 2900 series. It was introduced as an alternative to the mainstream operating system, VME/B, because the latter was too big and slow on the smaller machines. However, its life was short, and it was retired in the early 1980s due to cost cutting and internal politics.


This site is copyright © 2017 Bob Eager
Last updated: 06 Aug 2017