Su Tech Ennui: September 2007

Wednesday, September 19, 2007

This is about me.

There is a generation of computer programmers who at the time of writing are aged around 45 to 65 years, who learned computing during a wonderful period when everything was new and one person could learn all there was to know about a computer system from the bottom up.

I consider myself one of that generation. I wasn't an exemplary student - I graduated with a lower second, and I could name half a dozen of my contemporaries who I considered good programmers rather than average ones like myself. But now I look back and realise that the merely average graduate of 1980 was a programming god compared to most of today's graduates.

So this is about me. This isn't one of those tactfully-written third-party expositions on the state of education that you'll find published in an Educause journal. This is one guy's personal experience and my view of what is wrong with Computer Science education today.

Typically, a student of my generation would be able to write an assembler by the end of first year; a compiler by second year; boot a stand-alone executable with a run-time library that is accessed through kernel calls by the end of third year; and write a usable multi-tasking operating system by graduation at the end of fourth year. During this process we would have learned about hardware from the transistor level up and become intimately familiar with at least one architecture at the machine instruction level. We would not only have expertise in one programming language, but would have a working familiarity with half a dozen others (and as many machine architectures), and would have designed and implemented a language of our own at least once!

By graduation we would also have a good grasp of data structures and algorithmic complexity, and understand when and where it is appropriate to optimise code and when it is not worth the effort; we would have worked in group projects to build systems of significant size and would have a sound grasp of software engineering principles.

Computer education in recent years appears to have lost that focus on learning the fundamentals. Systems are so complex that it is impossible to know what is going on inside. How many graduates could tell you exactly the steps that a Windows or Linux system goes through to load a partially linked executable file and start it running? In the 70's the answer would have been 'all of them', whether the system was a stand-alone mini or the University's mainframe with a home-grown O/S on it. Today the answer would be 'damned few'.

Nowadays we don't teach students how to build a raster display from scratch and program their own rasterops library - we give them a nice regular virtual image store and a copy of the SDL manual if they're lucky. A compiler class will show students how to tweak GCC to add some new trick, but it won't teach them to write a compiler from first principles and bootstrap it using an assembler and a macro processor. I don't believe compiler writing has significantly improved since the 1970's when it was a high art, and a really good compiler could be written by one person in a year and a classroom-level one in a couple of weeks. Now it takes a team to write a compiler and it is so complex that the writers trip over each other's feet.

Programmers today have much knowledge but little understanding. They use large packages and build complex systems, but they don't create the underlying architectures or have a gut-level understanding of how everything works. We are turning out IT consumers, not developers. Quiche eaters, not Real Programmers.

Let me give an example from someone I know. I don't mean to cause offence because he's a friend, but like so many of his peers he goes home after work and switches off the Computer Science part of his mind. I'd be amazed if he reads tech blogs like this one and stumbles across this description.

I once asked my friend what he programmed for fun after he went home from work. After being blown away to learn that he didn't program - at all - I asked what his best project was at University, where he'd graduated with a BSc in Computer Science. It turns out that his crowning achievement, the pinnacle of his student years, and his big final-year project, was ... drum roll ... to write a reverse polish calculator. In fact he said he was one of only two in his class to succeed on the project.

I want you to think about that for a minute. A reverse-polish calculator.

So, no recursive-descent expression parser. No handling of operator precedence. No compiled code. No writing of the low-level floating point library. In fact I doubt very much if he even understood how floating point is represented. And he was proud of it!
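
To make the gap concrete, here is a minimal sketch in Python (my choice of language for illustration; the names `eval_rpn`, `tokenize`, `Parser` and `eval_infix` are all made up for this example). The reverse-polish evaluator is little more than a loop over a stack, whereas even a small recursive-descent parser has to encode operator precedence in its grammar, one method per level.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def eval_rpn(text):
    """Evaluate a reverse-polish expression such as '3 4 2 * +'."""
    stack = []
    for token in text.split():
        if token in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


def tokenize(text):
    """Split infix text into numbers, operators and parentheses."""
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch in "+-*/()":
            tokens.append(ch)
            i += 1
        elif ch.isdigit() or ch == ".":
            j = i
            while j < len(text) and (text[j].isdigit() or text[j] == "."):
                j += 1
            tokens.append(text[i:j])
            i = j
        else:
            raise ValueError(f"unexpected character {ch!r}")
    return tokens


class Parser:
    """Recursive descent: one method per precedence level.

    expr   := term (('+' | '-') term)*
    term   := factor (('*' | '/') factor)*
    factor := NUMBER | '(' expr ')' | '-' factor
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        if token is None:
            raise ValueError("unexpected end of input")
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            value = OPS[self.take()](value, self.term())
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            value = OPS[self.take()](value, self.factor())
        return value

    def factor(self):
        token = self.take()
        if token == "(":
            value = self.expr()
            if self.take() != ")":
                raise ValueError("expected ')'")
            return value
        if token == "-":
            return -self.factor()
        return float(token)


def eval_infix(text):
    """Evaluate an infix expression, honouring precedence and brackets."""
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise ValueError("trailing input")
    return value
```

With this sketch, `eval_rpn("3 4 2 * +")` and `eval_infix("3 + 4 * 2")` give the same answer, but only the second had to work out for itself that the multiplication binds tighter than the addition.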

These are the people who are graduating with a Computer Science degree nowadays. These are the people who are being hired to work in Computer Centers without ever being asked to write a line of code in their interviews.

Now, before I'm lambasted by Caltech and MIT and Stanford and UCLA graduates, I know full well there are still good Universities turning out competent students, but I also know that there are far more mediocre Universities turning out sub-standard students, ill-equipped to survive in the real world, and so damaged by their Higher Education that they have no hope of recovering.

The good ones are few and far between.

I would like to suggest that it is time to re-examine computer science education, and take a long-term look back at what worked and what didn't, from the early days of computing to the present, and then prepare a set of curriculum materials that will give current students that same thrill of building it all themselves from the bottom up, while giving them a solid grounding in Computer Science that will see them through a lifetime. Fads and whims come and go. Languages come and go. Fundamentals always remain.

Specifically, I suggest not only creating teaching materials, but also a software environment based on emulation which will recreate the joy of learning about a machine and developing all the software for it from scratch. This will be a consistent program which can be followed from beginner level through to graduation, but modularised so that any step (such as the compiler-writing module) can be taken for stand-alone use by any university or interested learner without requiring to have completed the previous modules (as long as the student had an equivalent level of experience.)

This material, including the source code to all the development environments for the practicals, which would be the bulk of the learning experience, would be made available via the Web so that it could be adopted by Universities, and also picked up directly by programmers who want to learn but aren't being given the opportunity because their classes are all about Java and Ruby on Rails.

Examples of the sort of practicals I envision include (not necessarily in chronological order):


  • Learn text editing with an old-school programmable text editor - not one that gives you RSI if you need to make the same sort of change to 100 different lines.
  • Learn to program - the basics
  • Algorithms, and algorithmic complexity from practical examples. Include some fun stuff like decrypting a 1:1 cypher - i.e. problem-solving exercises and not just programming exercises.
  • Machine architecture and instruction sets
  • Write a disassembler
  • Write an assembler
  • Write a trivial compiler
  • Learn what parsing is for, and design and implement a parser from first principles before reading any textbooks on how it is normally done. Then rewrite your parser twice, using two different methods of parsing
  • Write a macro processor or source to source language translator
  • Study more complex data structures and algorithms (crossover from theory classes) such as DAGs and graph minimisation (useful in advanced compiler writing)
  • Write a usable compiler
  • Design, implement, *and document* their own programming language (reusing their compiler project)
  • Write a binary to source decompiler
  • Write a stand-alone operating system (boot loader, basic I/O library)
  • Write a multi-tasking operating system, including a scheduler, for some affordable real hardware such as a Gameboy Advance
  • Write a text editor compatible with the one they started using when they first started programming
  • Write a screen editor and add a programmable feature set
  • Study graphics hardware architecture
  • Write a low-level graphics package (rasterops etc)
  • Write a higher-level graphics package (2D, 3D line-based, matrix transforms)
  • Write 3D rendering/shading code
  • Write a Scrabble game (concentrate on the engine, not the graphics). No lookahead needed for this one, it's mostly about data structures and algorithms for text manipulation.
  • Write an adversarial game (such as checkers or poker) which requires lookahead and pruning with algorithms like Minimax, NegaScout etc.
  • Write a video game (old school, pacman style, raw hardware) running on the O/S you wrote earlier, that relies on real-time cooperating processes
  • Write a video game (new school, multiplayer, rendered) that runs on the graphics library you wrote earlier.
  • A cooperative project where an interface has to be defined and one student or group depends on another to work to the interface. (For example, one group writes a first pass of a compiler that generates intermediate code; the other group writes a code generator for that intermediate code to a target architecture. Working examples to be supplied in case one group fails - but working example to be written deliberately poorly so it is not worth stealing! And as an example for the students to improve on.)
  • Implement a database from scratch. Write an application-specific search engine to exercise database lookups
  • Add persistence as a data attribute to a compiler
  • Implement a free-text search engine
  • Write a complete protocol handler (eg X25 Level 2)
  • Write a terminal emulator
  • Add multi-user terminals to multi-tasking operating system
  • Write a file server client
  • Write a file server
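
To give a flavour of the scale of these exercises, here is a minimal sketch of the lookahead-and-pruning idea from the adversarial game practical, written in Python. It is the negamax form of Minimax with alpha-beta pruning, which is a simpler relative of NegaScout rather than NegaScout itself. The game is an assumed toy example chosen only because it fits in a few lines: players alternately remove 1, 2 or 3 counters from a pile and whoever takes the last counter wins. The function names are invented for this illustration.

```python
def negamax(pile, alpha=-1, beta=1):
    """Return +1 if the player to move can force a win, else -1.

    Toy game (an assumption for this sketch): remove 1, 2 or 3 counters
    per turn; taking the last counter wins.
    """
    if pile == 0:
        return -1  # the opponent just took the last counter, so we lost
    best = -1
    for take in (1, 2, 3):
        if take > pile:
            break
        # The opponent's best result, negated, is our result for this move.
        score = -negamax(pile - take, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # prune: a winning move is already known
    return best


def best_move(pile):
    """Return a winning number of counters to take, or None if none exists."""
    for take in (1, 2, 3):
        if take <= pile and -negamax(pile - take) == 1:
            return take
    return None
```

For this particular game the search rediscovers the well-known rule that a pile which is a multiple of four is lost for the player to move. A real checkers or poker engine needs a depth limit and an evaluation function on top of this skeleton, which is where the interesting work in the practical lies.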

I've personally done most of the above as an undergrad or for amusement later. I've produced teaching materials that have been used successfully by self-motivated students via the Internet - the Static Binary Translation HOWTO - and, with some other experienced programmers of a certain age, I have helped motivated youngsters write their first compiler on the Yahoo group "Compilers 101". I strongly believe that Computer Science is fun and that the subject practically teaches itself if the exercises are sufficiently interesting. I also believe that if someone struggles with programming, it's obvious by the end of their first year, and they should go and find some other field that they're good at, and not burden the world with another unwanted mediocre programmer.

The list of practical projects given as an example above is more than I can expect any Computer Science course to take on nowadays and certainly more than I can turn out myself. I would judge that preparing the curriculum materials and test beds for a set of exercises such as the list above would take approximately three people and two years. Perhaps there are some retired lecturers/professors from the 70's who still feel they have a calling to pass on their knowledge who would want to work on this. The materials that would be created by a project like this could be used in 'pick and mix' fashion, not just in classes but by self-motivated learners teaching themselves on the internet. It should however have a long life because the fundamentals are essentially timeless. This teaching material would not become outdated due to new application libraries, new operating systems or new hardware.

The specific examples above are taken from my personal experience and my education at a good University that was one of the first to teach Computer Science - by the people who were at the leading edge of research in the field. I returned to my Alma Mater earlier this year when I organised a conference on the subject of Computer History and coupled it with a reunion of people who had learned their craft in the 60's and 70's. The sentiments I've described above about the decline of Computer Science education were shared by many of the participants, among them successful leaders of industry, and academics from a broad range of institutions.

It may be that the politics of education nowadays - more interested in greater student numbers in order to get tuition income, than in better student quality - will work against a movement that wants a revival in the teaching of fundamental computer science. Perhaps the complexity theorists have won, and good engineering will be forever relegated to a back seat role, but I live in hope that it is not too late to get us back on the right track.

Monday, September 17, 2007

Transferring mail from one gmail account to another using pop3

A year or so back when I set up my Google Apps For Your Domain (GAFYD) email account, I already had the best part of a year's mail in an old regular Gmail account and no way to transfer from one to the other.

Gmail did allow you to import from external POP3 servers but apparently you couldn't pull mail from another Gmail account. I haven't needed to do that since, but if it still behaves like that, here's the fix!

I discovered that if you did a DNS lookup of the Google POP3 server (pop.gmail.com) and entered the raw IP rather than the domain, it worked just fine :-) Whether it was an internal/external DNS view problem, or they actually did it deliberately by blocking the domain, I couldn't say.

I do have some vague recollection that the pop3 server wasn't accessible from the outside world. I guess GAFYD was able to access it because it was also on the inside...

Let's see if I can find my notes on how to do this... ah, here we are...

Username is 'whatever' if your email is 'whatever@gmail.com'. The Pop3 server is 66.249.83.109 (i.e. the IP of pop.gmail.com), Port is 995, "Always use SSL" is checked.
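
The address above is simply what the lookup returned when I made my notes, so it is safer to resolve the name afresh than to copy the number. Here is a minimal sketch in Python that builds the same four settings from an email address. The function name `pop3_settings`, the dictionary keys and the swappable `resolver` parameter are all inventions for this example; only `socket.gethostbyname` is a real library call.

```python
import socket


def pop3_settings(email, hostname="pop.gmail.com",
                  resolver=socket.gethostbyname):
    """Build the import-form values described above.

    `resolver` is a hook added for this sketch so the DNS lookup can be
    replaced (for example when trying the function without a network).
    """
    username, _, domain = email.partition("@")
    if not username or not domain:
        raise ValueError(f"not an email address: {email!r}")
    return {
        "username": username,          # the part before the '@'
        "server": resolver(hostname),  # raw IP instead of the domain name
        "port": 995,
        "use_ssl": True,
    }
```

Calling `pop3_settings("whatever@gmail.com")` performs a real DNS lookup and returns the values to type into the form. Whether the raw-IP trick itself still works is a separate question that this sketch cannot answer.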

I wouldn't be at all surprised if this workaround still works...

Cool Google-enabled search of all your own bookmarks

The iGoogle widget for searching a list of URLs can be used to create a search engine that searches all the sites you've bookmarked in your browser. If like me you have a somewhat fungible memory, and can remember that you've seen something relevant somewhere on the net but can't remember quite exactly where, then this is the hack for you.

Before you start, export your favorites from your browser. In IE this creates a file bookmark.htm. Find somewhere on the net where you can upload this file; you'll need it later as a URL.

Now, create a new search engine at Google's Coop page. It'll ask you for a list of URLs to start it off. Enter one URL, it doesn't really matter which as you'll be deleting it later. The form doesn't let you start with an empty list.

Once the search engine is created, go in to the management page for it and find the Sites tab where you can add new URLs to be searched. Click the "Add sites" button, and enter the URL of your bookmarks page. Here's the important part: select the check box for "Dynamically extract links from this page and add them to my search engine" and the sub-item for "Include all partial sites this page links to". After it's added, remove your original URL.
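
If you want to preview which links a crawler is likely to pick up from your bookmarks page before you upload it, a few lines of Python will list them. This is only a sketch under my own assumptions: it treats the export as ordinary HTML with anchor tags, keeps only http and https links, and says nothing about how Google's own extraction actually works. The names `LinkCollector` and `bookmark_links` are made up for the example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every anchor tag that points at a web URL."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A HREF=...>
        # in an exported bookmarks file arrives here as 'a' and 'href'.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.startswith(
                    ("http://", "https://")):
                self.links.append(value)


def bookmark_links(html_text):
    """Return the web links found in an exported bookmarks page."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links
```

Read the exported file into a string and pass it to `bookmark_links` to see the list. Entries such as javascript: bookmarklets are skipped by the prefix check.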

You also have the choice of searching *only* these sites, or (as I do) making them the premier results of your search but allowing all the other normal Google search results to follow your preferred ones.

I bookmark stuff pretty liberally and I have found this to be amazingly useful since I first started using it.

Google Apps For Your Domain meets iGoogle at last!

As an early adopter of GAFYD I was rather disappointed with it, because it pretty much duplicated the functionality of iGoogle, but did so with slightly different mechanisms and a separate independent login mechanism - so I had *two* separate accounts with identical usernames at Google, and it was hit or miss which Google service used which authentication mechanism.

The most annoying thing was that iGoogle widgets were documented and easy(ish) to write, but GAFYD portlets were restricted to those supplied, with no documentation as to how to write your own.

Well, today I tried a crude hack, and it paid off. I don't know if this was always the case, or if they changed something recently, but it turns out that iGoogle portlets *do* work in GAFYD. All you have to do is ... select your widget and install it at iGoogle (which is easy, there's a simple button to do it at the iGoogle widgets site), then go to the widget on your iGoogle homepage and start as if you're going to share it. Don't share it by email, but ask for the URL to cut & paste. Then edit the URL to remove the front part which is the iGoogle installer, and keep the end part which is the true home of the widget. Remember to change it to http: instead of http%3A ...
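
The URL surgery can also be done mechanically. Here is a minimal sketch in Python, assuming the share link carries the widget's real address as a percent-encoded query parameter. I did not record the exact shape of the link, so the function name and the example link in the usage note are hypothetical stand-ins.

```python
from urllib.parse import parse_qsl, urlsplit


def widget_url_from_share_link(share_link):
    """Return the first embedded http(s) URL found in the link's query.

    parse_qsl undoes the percent-encoding, so 'http%3A//' comes back as
    'http://' without any hand editing.
    """
    query = urlsplit(share_link).query
    for _name, value in parse_qsl(query):
        if value.startswith(("http://", "https://")):
            return value
    raise ValueError("no embedded widget URL found")
```

For a made-up link such as `http://installer.example/add?moduleurl=http%3A//widgets.example/clock.xml`, this returns `http://widgets.example/clock.xml`, which is the part to paste into GAFYD.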

Now go to 'add stuff' on your GAFYD home page. Find the small link at the top that allows you to add a portlet by URL, and paste in the URL you extracted from iGoogle.

Bingo! You now have your iGoogle widget running on your GAFYD homepage, and if you need to, you can go and remove the first copy from your iGoogle page.

Now if only they would add the tabs :-/

When Good iPods go Bad.

My wife's iPod recently started acting up and wouldn't sync at all. And for some time before, it had been refusing to disconnect after a sync. I tried to debug it by installing iTunes on my MCE, which did recognise the iPod, and detected it as being in a semi-reinitialised state, but it failed every time it tried to finish the reinitialisation.

Turned out the problem was that the iPod drive has to be mapped for iTunes to do its magic... well, since the last time it was plugged in, the drive letter that had been assigned to it was now assigned to a network drive. Unfortunately Windows is so stupid that it really wanted that letter for the iPod, and so when iTunes tried to access the iPod as a drive, it was hitting the network share instead.

By going in to the Manage option of My Computer and reassigning the preferred drive letter in the disk management console, the iPod became visible and iTunes was able to properly reinitialise it. At which point when it was put back on my wife's PC, she was able to use it again just fine.

Which was a relief, because it was an 80GB unit but out of warranty, and if it had been a hard drive error as originally thought, it would have been an expensive replacement.