Su Tech Ennui: This is about me.

Wednesday, September 19, 2007

There is a generation of computer programmers who at the time of writing are aged around 45 to 65 years, who learned computing during a wonderful period when everything was new and one person could learn all there was to know about a computer system from the bottom up.

I consider myself one of that generation. I wasn't an exemplary student - I graduated with a lower second, and I could name half a dozen of my contemporaries who I considered good programmers rather than average ones like myself. But now I look back and realise that the merely average graduate of 1980 was a programming god compared to most of today's graduates.

So this is about me. This isn't one of those tactfully-written third-party expositions on the state of education that you'll find published in an Educause journal. This is one guy's personal experience and my view of what is wrong with Computer Science education today.

Typically, students of my generation were able to write an assembler by the end of our first year of study; write a compiler by second year; boot a stand-alone executable with a run-time library that is accessed through kernel calls by the end of third year; and write a usable multi-tasking operating system by graduation at the end of fourth year. During this process we would have learned about hardware from the transistor level up and become intimately familiar with at least one architecture at the machine instruction level. We would not only have expertise in one programming language, but would have a working familiarity with half a dozen others (and as many machine architectures), and would have designed and implemented a language of our own at least once!

By graduation we would also have a good grasp of data structures and algorithmic complexity, and understand when and where it is appropriate to optimise code and when it is not worth the effort; we would have worked in group projects to build systems of significant size and we would have a sound grasp of software engineering principles.

Computer education in recent years appears to have lost that focus on learning the fundamentals. Systems are so complex that it is impossible to know what is going on inside. How many graduates could tell you exactly the steps that a Windows or Linux system goes through to load a partially linked executable file and start it running? In the 70's the answer would have been 'all of them', whether the system was a stand-alone mini or the University's mainframe with a home-grown O/S on it. Today the answer would be 'damned few'.

Nowadays we don't teach students how to build a raster display from scratch and program their own rasterops library - we give them a nice regular virtual image store and a copy of the SDL manual if they're lucky. A compiler class will show students how to tweak GCC to add some new trick, but it won't teach them to write a compiler from first principles and bootstrap it using an assembler and a macro processor. I don't believe compiler writing has significantly improved since the 1970s, when it was a high art, and a really good compiler could be written by one person in a year and a classroom-level one in a couple of weeks. Now it takes a team to write a compiler and it is so complex that the writers trip over each other's feet.

Programmers today have much knowledge but little understanding. They use large packages and build complex systems, but they don't create the underlying architectures or have a gut-level understanding of how everything works. We are turning out IT consumers, not developers. Quiche eaters, not Real Programmers.

Let me give an example from someone I know. I don't mean to cause offence because he's a friend, but like so many of his peers he goes home after work and switches off the Computer Science part of his mind. I'd be amazed if he reads tech blogs like this one or would ever stumble across this description.

I once asked my friend what he programmed for fun after he went home from work. After being blown away to learn that he didn't program - at all - I asked what his best project was at University, where he'd graduated with a BSc in Computer Science. It turns out that his crowning achievement, the pinnacle of his student years, and his big final-year project, was ... drum roll ... to write a reverse polish calculator. In fact he said he was one of only two in his class to succeed on the project.

I want you to think about that for a minute. A reverse-polish calculator.

So, no recursive-descent expression parser. No handling of operator precedence. No compiled code. No writing of the low-level floating point library. In fact I doubt very much if he even understood how floating point is represented. And he was proud of it!
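To make the gap concrete, here is a minimal sketch (in Python, purely for brevity) of the two things side by side. The function and class names are my own inventions for illustration, and the grammar is just the usual textbook one for four-function arithmetic. The reverse-polish evaluator needs nothing but a stack; the infix evaluator needs a tokenizer and one function per precedence level.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def eval_rpn(text):
    """Evaluate space-separated reverse polish, e.g. '2 3 4 * +'."""
    stack = []
    for tok in text.split():
        if tok in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(float(tok))
    if len(stack) != 1:
        raise ValueError("malformed RPN expression")
    return stack[0]

def tokenize(text):
    """Split infix text into number, operator and parenthesis tokens."""
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch in "+-*/()":
            tokens.append(ch)
            i += 1
        elif ch.isdigit() or ch == ".":
            j = i
            while j < len(text) and (text[j].isdigit() or text[j] == "."):
                j += 1
            tokens.append(text[i:j])
            i = j
        else:
            raise ValueError(f"unexpected character {ch!r}")
    return tokens

class Parser:
    """Recursive descent: one method per precedence level."""
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):    # expr := term (('+' | '-') term)*
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            value = OPS[op](value, self.term())
        return value

    def term(self):    # term := factor (('*' | '/') factor)*
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.take()
            value = OPS[op](value, self.factor())
        return value

    def factor(self):  # factor := number | '(' expr ')' | '-' factor
        tok = self.take()
        if tok == "(":
            value = self.expr()
            if self.take() != ")":
                raise ValueError("expected ')'")
            return value
        if tok == "-":
            return -self.factor()
        if tok is None:
            raise ValueError("unexpected end of input")
        return float(tok)

def eval_infix(text):
    """Evaluate infix text such as '2 + 3 * 4' with normal precedence."""
    parser = Parser(text)
    value = parser.expr()
    if parser.peek() is not None:
        raise ValueError("trailing input")
    return value
```

Calling eval_rpn("2 3 4 * +") and eval_infix("2 + 3 * 4") should give the same answer; the difference is that in the first case the human has already done the parsing.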

These are the people who are graduating with a Computer Science degree nowadays. These are the people who are being hired to work in Computer Centers without ever being asked to write a line of code in their interviews.

Now, before I'm lambasted by Caltech and MIT and Stanford and UCLA graduates, I know full well there are still good Universities turning out competent students, but I also know that there are far more mediocre Universities turning out sub-standard students, ill-equipped to survive in the real world, and so damaged by their Higher Education that they have no hope of recovering.

The good ones are few and far between.

I would like to suggest that it is time to re-examine computer science education, and take a long-term look back at what worked and what didn't, from the early days of computing to the present, and then prepare a set of curriculum materials that will give current students that same thrill of building it all themselves from the bottom up, while giving them a solid grounding in Computer Science that will see them through a lifetime. Fads and whims come and go. Languages come and go. Fundamentals always remain.

Specifically, I suggest not only creating teaching materials, but also a software environment based on emulation which will recreate the joy of learning about a machine and developing all the software for it from scratch. This will be a consistent program which can be followed from beginner level through to graduation, but modularised so that any step (such as the compiler-writing module) can be taken for stand-alone use by any university or interested learner who has not completed the previous modules (as long as the student has an equivalent level of experience).

This material, including the source code to all the development environments for the practicals, which would be the bulk of the learning experience, would be made available via the Web so that it could be adopted by Universities, and also picked up directly by programmers who want to learn but aren't being given the opportunity because their classes are all about Java and Ruby on Rails.

Examples of the sort of practicals I envision include (not necessarily in chronological order):

  • Learn text editing with an old-school programmable text editor - not one that gives you RSI if you need to make the same sort of change to 100 different lines.
  • Learn to program - the basics
  • Algorithms, and algorithmic complexity from practical examples. Include some fun stuff like decrypting a 1:1 cypher - i.e. problem-solving exercises and not just programming exercises.
  • Machine architecture and instruction sets
  • Write a disassembler
  • Write an assembler
  • Write a trivial compiler
  • Learn what parsing is for, and design and implement a parser from first principles before reading any textbooks on how it is normally done. Then rewrite your parser twice, using two different methods of parsing
  • Write a macro processor or source to source language translator
  • Study more complex data structures and algorithms (crossover from theory classes) such as DAGs and graph minimisation (useful in advanced compiler writing)
  • Write a usable compiler
  • Design, implement, *and document* their own programming language (reusing their compiler project)
  • Write a binary to source decompiler
  • Write a stand-alone operating system (boot loader, basic I/O library)
  • Write a multi-tasking operating system, including a scheduler, for some affordable real hardware such as a Game Boy Advance
  • Write a text editor compatible with the one they started using when they first started programming
  • Write a screen editor and add a programmable feature set
  • Study graphics hardware architecture
  • Write a low-level graphics package (rasterops etc)
  • Write a higher-level graphics package (2D, 3D line-based, matrix transforms)
  • Write 3D rendering/shading code
  • Write a Scrabble game (concentrate on the engine, not the graphics). No lookahead needed for this one, it's mostly about data structures and algorithms for text manipulation.
  • Write an adversarial game (such as checkers or poker) which requires lookahead and pruning with algorithms like Minimax, NegaScout etc.
  • Write a video game (old school, pacman style, raw hardware) running on the O/S you wrote earlier, that relies on real-time cooperating processes
  • Write a video game (new school, multiplayer, rendered) that runs on the graphics library you wrote earlier.
  • A cooperative project where an interface has to be defined and one student or group depends on another to work to the interface. (For example, one group writes a first pass of a compiler that generates intermediate code; the other group writes a code generator for that intermediate code to a target architecture. Working examples to be supplied in case one group fails - but working example to be written deliberately poorly so it is not worth stealing! And as an example for the students to improve on.)
  • Implement a database from scratch. Write an application-specific search engine to exercise database lookups
  • Add persistence as a data attribute to a compiler
  • Implement a free-text search engine
  • Write a complete protocol handler (e.g. X.25 Level 2)
  • Write a terminal emulator
  • Add multi-user terminals to the multi-tasking operating system
  • Write a file server client
  • Write a file server
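To give a flavour of the scale involved, here is a minimal sketch of the lookahead-and-pruning idea from the adversarial game practical. It is not checkers or poker: I have swapped in a toy subtraction game (take 1, 2 or 3 stones; whoever takes the last stone wins) so the whole thing fits on a page. The search is negamax with alpha-beta pruning, and the function names are hypothetical ones chosen for this illustration.

```python
def negamax(stones, alpha=-1, beta=1):
    """Return +1 if the player to move can force a win, -1 otherwise.

    Toy game (an assumption for this sketch): players alternately take
    1, 2 or 3 stones, and whoever takes the last stone wins.
    """
    if stones == 0:
        # The previous player took the last stone, so the mover has lost.
        return -1
    best = -1
    for take in (1, 2, 3):
        if take > stones:
            break
        # The opponent's best result, negated, is our result for this move.
        score = -negamax(stones - take, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # prune: a winning move is found, no need to look further
    return best

def best_move(stones):
    """Return a winning number of stones to take, or 1 if every move loses."""
    for take in (1, 2, 3):
        if take <= stones and -negamax(stones - take) == 1:
            return take
    return 1
```

Without a transposition table the search is exponential in the number of stones, so this is only usable for small piles. That is part of the exercise: the student should notice, measure it, and fix it.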

I've personally done most of the above as an undergrad or for amusement later. I've produced teaching materials that have been used successfully by self-motivated students via the Internet - the Static Binary Translation HOWTO - and, with some other experienced programmers of a certain age, I have helped motivated youngsters write their first compiler on the Yahoo group "Compilers 101". I strongly believe that Computer Science is fun and that the subject practically teaches itself if the exercises are sufficiently interesting. I also believe that if someone struggles with programming, it's obvious by the end of their first year, and they should go and find some other field that they're good at, and not burden the world with another unwanted mediocre programmer.

The list of practical projects given as an example above is more than I can expect any Computer Science course to take on nowadays and certainly more than I can turn out myself. I would judge that preparing the curriculum materials and test beds for a set of exercises such as the list above would take approximately three people and two years. Perhaps there are some retired lecturers/professors from the 70's who still feel they have a calling to pass on their knowledge who would want to work on this. The materials that would be created by a project like this could be used in 'pick and mix' fashion, not just in classes but by self-motivated learners teaching themselves on the internet. They should, however, have a long life because the fundamentals are essentially timeless. This teaching material would not become outdated due to new application libraries, new operating systems or new hardware.

The specific examples above are taken from my personal experience and my education at a good University that was one of the first to teach Computer Science - by the people who were at the leading edge of research in the field. I returned to my Alma Mater earlier this year when I organised a conference on the subject of Computer History and coupled it with a reunion of people who had learned their craft in the 60's and 70's. The sentiments I've described above about the decline of Computer Science education were shared by many of the participants, among them successful leaders of industry, and academics from a broad range of institutions.

It may be that the politics of education nowadays - more interested in greater student numbers in order to get tuition income, than in better student quality - will work against a movement that wants a revival in the teaching of fundamental computer science. Perhaps the complexity theorists have won, and good engineering will be forever relegated to a back-seat role, but I live in hope that it is not too late to get us back on the right track.


Electropostie said...

I read a paper today that highlights some related issues.

Electropostie said...

Found an even better article on the decline and fall of (Computer Science in) the British University. Quite scathing.

Denny said...

I wish something similar would be done for hardware instruction. I recently was called a liar when I explained to some younger hardware techs that the man who taught me started with CORE memory systems at a bank in the sixties, was a wizard with SCO's XENIX and could spin a screw with zip. Heck, try to tell them that utility bills used to come on punch cards and you get a blank look and the question "what is a punch card?"

G said...

See this article, Revenge of the Frosh-seeking Robots, from The American.

- "But the Research staff put its collective finger on something more fundamental. Computer science, as it is currently taught in U.S. colleges, is suffering an excitement deficit. It is critical to make the first experience students have in computer science compelling and linked to real-world results. Too often, American colleges and universities don’t do that.

The way computer science is taught in colleges and universities, says Bryan Barnett, lead program manager for Microsoft Research, “turns kids off almost immediately.” They typically spend a year learning basic theoretical concepts and code syntax. There’s little concrete to show for such work."

Unfortunately the article's focus is on that group of students who select their career path based on what they see as likely to be in demand when they graduate, and in particular it discusses students going into Economics.

Frankly, I don't think we want those students coming into Computer Science. Good computer scientists do it because they can't imagine doing anything else with their lives except programming - it's a vocation or a calling, not merely a job. The people who deserve to be computer scientists will find their way to it regardless of what track they start on at University, and they'll do well regardless of whether they have good lecturers or not. Great computer scientists are born, not made.

Peter Robertson said...

It's all sad, terrifying, and true.

I am currently building up to explode as my son (16 years old) works his way towards an examination in a school subject called "Higher Computing". This must be a euphemism for something my education didn't cover, as what is presented could hardly be lower in quality and is barely recognisable as “computing”.

The textbook he is using was written by someone who demonstrates a limited understanding of both the subject and teaching, and seems to have produced it more to bolster his ego than present what the pupils need.

It started with the statement that “Mb” means megabytes (it doesn't; it means megabits) and went downhill from there. Almost every page in the book had something that brought steam to my ears. How, for example, can you justify a section on processors that gives gory details of a selection of varieties of RAM while completely ignoring the concept of a program counter? How does the processor “find the next instruction”? By magic? Divination? Email?

I started listing the things that were wrong, but quickly discovered that I would end up writing a tome that was considerably larger than the book in question.

Some of the teachers even admit that they know they are teaching nonsense, but claim they have to do it, otherwise pupils giving the “correct” answers to examination questions will lose marks. No doubt the markers too are ignorant or are being forced to mark answers blindly.

Don't get me started on my son being told to comment “End If” lines in Visual Basic with “This is the end of an IF statement”!

Here's a quiz taken from a site that pupils use to “help” them learn the subject. The wording may not be exact, but it's near enough to make the point.

“Which of the following is not given when you declare a variable?
A: its type
B: its initial value
C: its name
D: how much memory it needs”

Like me, every computer professional I have asked has given the “wrong” answer.
Here's a clue: think of “char x;” or “double y;” in C.
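The clue can be checked from Python as a small sketch, using the standard ctypes module, whose c_char and c_double mirror the C types of the same names. The helper name declared_size is made up for this illustration. The point is that naming the type already fixes how much memory the variable needs, while no initial value is given at all.

```python
import ctypes

def declared_size(ctype):
    """Bytes implied by a C type alone, as in 'char x;' or 'double y;'."""
    return ctypes.sizeof(ctype)

if __name__ == "__main__":
    # Sizes are reported by the platform's C ABI; char is always 1 byte,
    # and double is 8 bytes on the common platforms ctypes supports.
    for name, ctype in (("char", ctypes.c_char), ("double", ctypes.c_double)):
        print(f"{name}: {declared_size(ctype)} byte(s), no initial value given")
```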

It is too late to do anything for my son (other than teach him properly myself), but I am so inflamed by this that I cannot let it rest. How can we corrupt our children like this? Once his exams are out of the way, I intend making visible the shameful state of computing in schools and doing my utmost to improve how the subject is taught.

Anyone who wants to join me in this will be welcome.

G said...

Robert Dewar and Edmond Schonberg (guys about my age I suspect) have this to say about the role of programming languages in CS education

It echoes what was said some time ago by that indefatigable pundit of the computer industry, Joel Spolsky.

G said...

I'll probably lose this in my bookmarks so the best place to file it is here. When I finish my free textbook on compiler writing that I'm working on, I'll pass it on to these people among others, for distribution:

While on the subject of stuff worth bookmarking, Sedgewick, the author of several books and papers on Algorithms, is a professor at Princeton, and there's a lot of good lecture material from his classes online at his site:

Electropostie said...

This article at HackZine pointed me at a good CS course in Israel that teaches CS from first principles. (Indeed, that's the title of the lecturer's book...)

Could this emphasis on the basics be why Intel's chip designers are all Israeli now?

G said...

This link (pdf) replaces the dead link in the first comment.

G said...

I finally got the book "Building a Modern Computer from First Principles" by Noam Nisan and Shimon Schocken and found it a little disappointing. Not /bad/ per se, just not as 'first principles' as I had hoped. It's very light on the hardware and graphics design, and is mostly just another book on how to write an assembler/compiler/intermediate code (aka VM)/interpreter, one that I think conflates a run-time library/BIOS with an operating system. Even as a basic compiler book I can't take a 'first principles' declaration seriously when it's implemented on technologies such as C++, Java, XML etc. Bootstrapping itself through a cut-down C might have earned my respect.