Category Archives: Grad School

The Amazing Miss A

Via this post in Michael Trick’s Operations Research Blog, I discovered a truly mesmerizing article about teaching: The Amazing Miss A And Why We Should Care About Her. I hope you would enjoy it as much as I do.

And after you have read Michael’s post and the article, I invite you to think about CS education.

CMU’s WebVPN Proxy

I think this would be worth mentioning for my CMU fellows:

As you know we have campus-wide subscriptions to many web services, like the ACM Digital Library and Elsevier’s “ScienceDirect”. But if you are off-campus, then understandably you cannot take advantage of such subscriptions. There was a time when all you could do was to ssh into a campus machine followed by, say, a wget+scp dance. But these days you can also use the “WebVPN service“, which is I find to be a lot more convenient, and direct.

Bibliometrics in ACM Digital Library

I just noticed that the ACM Digital Library have started to prominently display “Bibliometrics” for each of its articles. Currently, we get the number of downloads in the last 6 weeks and last 12 months, as well as the number of citations. The last of which was available before, but you had to go further down into the page.

In case you want a quick link to see, here is Bob Tarjan’s classic Efficiency of a Good But Not Linear Set Union Algorithm.

My Vote Goes to Red-Black

On Geomblog, Piotr mentioned a problem that I think many of us can share: if we really have to pick just one to teach (as in his case), do we go with 2-4 trees or red-black trees?

For the most part, the pros and cons of the two approaches are quite, urr, balanced. :P But there has been one critical issue that tilt my preference towards RB. On the surface, the issue is “augmentation”, ala CLRS chapter 14 style. But it really is about the number of rotations used during insertions and deletions…

There is a fascinating history behind this, which is summarized by Ottmann and Wood up to 1990. It all started with a design from Olivie and a swift response from Tarjan. The rest of the juicy bits, well, I offer no spoiler. This post merely needs the following fact to make sense: in the worst case, some implementations of RB need only $O(1)$ rotations during restructuring and the remaining work are color changes; some implementations need logarithmic number of rotations.

But if we can afford a logarithmic number of color changes in the worst case, isn’t it just “a matter of constants” even if we have to do a logarithmic number of rotations as well?

In many cases, the answer is indeed yes, but there is no lack of interesting cases due to augmentations that make rotations “expensive”. For a starter: see section 13.4 of Bob’s textbook, pp 570-571 in the C++ 3rd edition, where this very issue is raised and the model answer is presented. Here is a favorite example of mine: Seidel and Aargon used this to highlight why treaps are preferable to splay trees in applications like segment trees. Let’s end with one more related to this post: McCreight’s priority search trees, used as the motivating example in Ottmann and Wood, also critically rely on search trees that use only $O(1)$ number of rotations to restructure. More precisely, doing $p$ rotations yields a bound of $O(p \log n)$. See, it’s not just in the constants!

I will finish this by saying that this doesn’t imply a landslide victory for RB. Conceptually, 2-4 trees are really hard to beat! In fact, even though I would be speaking RB in my mouth, I would be thinking about 2-4 at the back of my mind! (Kind of like some foreign language speakers, no? Ten years ago I was exactly like this, haha.) And really, if we ever get to more advanced stuff like supporting finger searches, I will pick 2-4 on most but not all days. Complications… complications… :P

Exercise: To understand how a 2-4 speaker would speak in RB, what is “black-height” in the 2-4 language?

Reference: Thomas Ottmann, Derick Wood: How to Update a Balanced Binary Tree with a Constant Number of Rotations. SWAT 1990: 122-131

Course Announcement: Special Topics in Combinatorial Optimization

47-853 Special Topics in Combinatorial Optimization: Iterative Relaxation and Rounding

Time: Mondays and Wednesdays 3.30-5.30 pm from August 27 – Oct 15 2007.

Place: Room 318, GSIA (old) building

Instructors: R. Ravi and Mohit Singh

Polyhedral methods have been a powerful tool in combinatorial optimization. In this course we will study exact linear programming formulations for a variety of problems which are polynomial time solvable. In particular, we will explore the effectiveness of a new technique based on iterative augmentation in proving some traditional results. We will also examine min-max relations and efficient algorithms for these problems. Towards the end of the class, we will illustrate the original iterative rounding method that inspired these results as well as some applications to approximation algorithms.

Tentative Outline:

1. Linear Programming
2. Bipartite and general matchings
3. Trees and connectivity augmentation
4. Matroid bases, intersection of two matroids
5. Union of matroids and packing trees
6. Submodular flow minimization
7. Graph orientations, Dilworth’s and Lucchesi-Younger theorems
8. Network design and degree constraints

Grades will be based on revising and correcting already prepared preliminary scribe notes, and on one or two problem sets over the course of the class. Solving any open problem posed in class will automatically absolve you from other work!
We require all attendees to register, if only for audit.

A Short Note on Ergonomics

A lot people around me have Repetitive Strain Injury (RSI), and it seems to me that they are constantly in search of that “ergonomic keyboard”, or that “ergonomic mouse”, or that “ergonomic something”.

Now I honestly do not know if such devices exist, but I do have an observation when it comes to preventing / treating RSI: our lives practically demand the Strain, and therefore, it seems wiser to deal with the Repetitiveness.

Here is the punchline: buy a bunch of keyboards and pointing devices, each different in their design. Then rotate between them once every two weeks or so.

This rotation scheme has served me very well. I can only hope it works for you. I am not a therapist.

The crucial point to note is that these devices should be as different from each other as possible so that they strain your muscles somewhat differently. For instance, if you just go out and buy ten optical mouses, that probably won’t work very well. Perhaps rotating between just one trackball and one mouse will serve you better.

I conjecture that this “strain your body differently” principle is what some therapists are solely relying on, whether they are consciously doing so or not.

One of my friends, regardless of how “ergonomic” her keyboard is, still needs to go see her therapist once in a couple months. And every time she sees her therapist, the therapist’s recommendation always results in a change. Maybe the mouse is different this time, maybe the keyboard tray’s height will be adjusted. For the first few weeks after the session, the change “works”. But the problem will manifest itself again afterwards, so she goes see her therapist… and the cycle repeats.

Note that I am not arguing that no device is “more ergonomic” than the others. For example, I actually believe if you want to stick to one keyboard in your whole life, you want to stick with Kinesis Contoured.

But really, the ThinkPlus USB Keyboard with UltraNav is also very nice. Now, throw in a 3M vertical mouse and a very simple Logitech trackball. Together with the keyboard and mouse that you already have, you have just built a very cool equipment pool.

Finally, I note that this rotation model should work very well if you have a lab or research group that can pool the equipment budget.

P.S. As for chairs, I don’t have the money to buy a couple fancy chairs. Since I don’t use air conditioning much, I like Aeron best. I just keep adjusting my Aeron during the day before I get fatigue in a pose. I know some people say Leap or Freedom or something else is better, but who’s going to buy it for me? :P

Theory Lunch Food

As I’m sure you all have noticed, we’ve been having either Bagel Factory sandwiches or Napoli pizza at every theory lunch this semester. As one of the co-organizers, ordering the same thing every week makes my life easier, but even I’m getting a little sick of it. So if anyone has any suggestions for food, please let me know, either by e-mail or in the comments. The only constraint is that, including delivery and tip, it can’t cost more than 200 dollars and has to feed about 30-35 people (we like to err on the side of having extra food rather than running out). We also like to have some veggie stuff for you vegetarians and some meat stuff for the rest of us. We’ve been ordering from Wheel Deliver since they’re used to dealing with the University and it’s nice to have a centralized place to order everything, so ideally we’d keep using them, but I’m open to other suggestions too. So please, give me some ideas, since otherwise we’ll probably just keep alternating between sandwiches and pizza. I’d even be interested in whether you like the sandwiches or the pizza more. I know that I prefer the pizza, but apparently a fair number of people disagree with me.

See you all at lunch on Wednesday!

An Idea on Theory Exam Questions

In a discussion with a fellow student about grading, we came up with the following method which can make some Theory exam question with longer proofs easier to grade. I believe that, when used with caution, it can be a good tool to have.

Let me stress the word “caution”, since it is not a good idea to use this method on all problems on all exams.

The basic idea is to give the proof away, but in a scrambled fashion. Suppose the proof can be broken into $k$ “units of thoughts” (each unit is probably one or two statements). We will first append some $n-k$ junk units to the end of the proof, yielding a total of $n$ units. Then, we take a random permutation of the $n$ units and give the list to the students. The answer to such an exam question would then be an ordered sequence of numbers indicating the relevant units in the list.

There are many variations on this idea.

First we need to pick $n$. That clearly should depend on $k$. (Somewhere around $1.5k$ to $2k$ may be a good idea.)

Then we may decide to reveal the value of $k$ or not. (Revealing makes it a lot easier.)

Also, some of the junk units may actually be mathematically wrong but “reasonably sounding” in the context of the proof. We call these bombs. If the answer has a bomb, it automatically gets a zero.

And should we choose to add bombs, then maybe it is also an interesting twist to allow the students to pick out exactly all the bombs. For some classes, being able to pick out exactly all the bombs in a question should really qualify for a full credit for it. (Oh, Number Theory…)

Now that leaves the question of partial credits, which can be a highly subjective matter in the traditional “write-a-proof” format. But in this “pick-out-the-proof” format, it may become easier if the junk units are prepared carefully.

Finally, notice that the permutation needs not be the same for different students. This can be good for rooms with little elbow room. Grading is now a bit harder but should not be too bad.

Advising and Project Management

Several days ago, David asked how to evaluate an adviser and later Lance replied with his post of wisdoms.

David pointed out that he knew of relatively few resources on how to be an adviser. (His post has one link on this.) While I am not ready to write a post of experience like Lance did, I do have a thought to share.

Not long ago, CMU CS alum Scott Berkun gave a talk based on his book on project management. In his talk, Scott told us many interesting stories about his project management experience back in Microsoft. What struck me was his lists of items of “do”s and “don’t”s in management and his list of questions to ask the team members as a project progresses into different stages.

While Scott was focusing on software projects, I believe some of his ideas are applicable to academic advising. For example, he suggests that we should explicitly ask our team members (students) “Do you think you have all the resources you need for you to succeed?” and “Do you feel that this meeting is a waste of your time?” (Whether you get an honest answer back is another matter. And if you don’t expect you can get an honest answer, then you already have a complicated issue to tackle.)

In hindsight, I suppose it’s hardly surprising that academic advisers can draw experiences from software project managers. They are both managing smart people in a creative process. While the two fields have many differences, the human factor can be pretty much the same.

P.S. Scott’s background is in software project management and the book is not directly written for academic advisers. It takes some effort to relate his ideas with advising. So don’t rush out to buy the book and then tell me that you can’t find a short list on how to be an adviser. There isn’t.

I have written you a long paper…

Roy Levin, a friendly CMU alum, told us a story a couple weeks ago:

A job applicant was asked to write a 10-page description of a project he previously participated. The documentation of that project was well over a thousand pages and so he said there was no way to describe it in 10 pages… (The rest is history. :P )

Then Roy offered the following wisdom:

In a field that prides itself with the very idea of abstractions, everything can be explained in 10 pages. In fact, everything can be explained in one page. Good authors abstract the material to an appropriate level.

I suppose everyone agrees with his advice, but I wasn’t fully aware of that property of my field until he said it. I could have been doing it subconsciously before, but I do it consciously from that day on.

Yet it takes time and skill to do the abstraction right. I’ve seen positive and negative examples. In this regard, I remember a quote from Mark Twain, or Blaise Pascal, or really, Google:

I have written you a long letter because I did not have time to write a short one.

Some days I need to keep screaming in my head: I can explain this lucidly in 10 pages!

Quicksort Alike

Recently I had to “analyse” randomized Quicksort over a dinner table. Well, I wasn’t prepared to do it using only a pair of chopsticks and a bowl. So I “cheated”. :P

Instead of picking a random element and accept it as the pivot, we let the partitioning algorithm test if the random element is a balanced pivot, i.e., it sits between the 25-percentile and the 75-percentile. This test takes exactly linear time. And if not, the algorithm will pick another random element and re-test.

Clearly the probability of picking a balanced pivot is 1/2 and so the partitioning algorithm will finish in an expected constant number of trials. The rest of the proof is easy and using five more minutes I “proved” the expected $O(n \log n)$ bound with surprisingly little hand-waving. I also ate a lot of garlic bok choy in the process.

Now here are some simple questions that arise from this incident:

  • What about finding the median as the pivot? (And we have a choice of randomized and deterministic median algorithms.)
  • What about the concentration? Is this $O(n \log n)$ “with high probability”? (For instance, what is the probability that it takes 42 trials before a balanced pivot is found?)
  • Can we use a similar view to analyze the running time of the unmodified version of Quicksort? (Try to come up with a clustering of the nodes in the recursion tree and analyze the running time spent in each cluster of nodes.)

More homework questions…

Collecting Feedback for Talks

Every since I started giving talks, I wanted to know how my talks really were. But then, it seems difficult to collect the real opinions because of various reasons. (Diplomacy, friends, countrymen, comrades, etc.) For a start, I want to be able to conduct a thumbs-up/down vote as publicly as possible, but also preserves the secrecy of the votes.

Being at CMU actually makes this problem very easy to “solve”. All I did was to walk into a real crypto person in the corridor the very next morning. And before I finish telling my story, I am already presented with a proposal that is actually executable by real human beings. The trick is to have a way to distribute secret random bits.

Here is how one system may work to ask each person “Is this a good talk?”. At the door, each person is dealt with a card from a shuffled deck. Half of the cards in the deck have “This is a good talk” written underneath it, and the other half have “This is a bad talk”. That’s the secret bit for each person.

Two bins are placed at the door: “I agree” and “I disagree”. At the end of the talk, you just put the card (faced down, of course) into the appropriate bin. As long as your secret bit is not revealed, no one could be sure what your vote really was from observing which bin you used.

I can think of a number of ways to extend the number of questions asked in various ways, all involving secret initialization bits and XOR’ing with the answer bits.

So what questions do I want to ask? After asking around, it seems that the following questions are more interesting:

  • Do you understand the problem statement?
  • Do you think the problem is interesting?
  • Do you understand the proposed solution in the high level?

This works better than pure thumbs-up/down because we may want to distinguish YNN from YYN.

I guess I will use it in my next talk and see how it goes.

Colloquium on CS Pedagogy 2005-11-01

Tuesday, November 1
3:00 p.m.
3305 Newell-Simon Hall

“Thoughts on Introductory Computer Science Education”
Peter Lee, Professor, School of Computer Science, Carnegie Mellon University

Computer scientists and computer science educators are constantly reassessing the introductory curriculum. The intensity of this reassessment has increased, in large part because it is seen as relevant to the recent concerns about declining enrollments and interest in the field.

For the most part, introductory computer science has emphasized hands-on experiences and the development of programming skills. While hands-on skills development has been a mainstay of almost all introductory laboratory-science courses in, for example, biology and chemistry, introductory CS has been unusual, both in its heavy use of “industrial-strength” tools, such as the C++ programming language and commercial web-development systems, as well as the extent to which skills are emphasized over other aspects of the field, such as its history and scientific (or mathematical) underpinnings. In a sense, intro CS has attempted to serve not only as an introduction to the field but also as a valuable form of vocational training.

In this talk, I will give my thoughts on the introductory curriculum. I will argue that the use of industrial-strength tools is often inappropriate, just as it would be in, say, a high-school chemistry course. Instead, I believe that a more well-rounded curriculum that provides a balance of skills development, mathematical foundations, and historical context would be both more appropriate and more effective in inspiring young people to enter the field and understand its scope.

Insert Blank Slides During Talks

Some days ago Anupam asked me how to insert a blank slide right after the current one during a slide show. It may seem curious to many people, but if you teach with a Tablet PC, then it makes perfect sense. This is analogous to having blank transparencies (and the Tablet stylus as the marker) when using an overhead projector. It’s especially useful when a question pops up from the audience and you would like to address it with the help of writing things down. And if you are using PowerPoint 2003, you can retain all the handwritings you made during the slide show and distribute them together with the exisiting slides.

The straightforward way of doing it is to break out from the slide show, insert a blank slide and then resume the show. But do we have a slicker way?

Turns out this is not hard to do in PowerPoint. First, press Alt-F11 (Tools->Macro->Visual Basic Editor). Select Insert->Module in the main menu. Now enter the following code:

Sub InsertBlankSlide()
Dim newIndex As Long
With ActivePresentation
newIndex = .SlideShowWindow.View.Slide.SlideIndex + 1
.Slides.Add newIndex, ppLayoutBlank
.SlideShowWindow.View.GotoSlide newIndex
End With
End Sub

Select File->Close and Return to PowerPoint. Now go to the Master slide and create a new shape there. Right click on that shape and select Action Settings. In the Mouse Click tab, select Run Macro and pick InsertBlankSlide from the combo box underneath it. Click OK and there you go. (Try running the presentation and click on that new shape.)

For a more elaborate example that also includes changing pen type, I refer you to Anupam’s slides in 15-251 (CMU’s 200-level CS theory core). For example, look at this one. It’s true that you can change the pen type by pressing hot keys during a slide show, but if you are using a slate Tablet PC, then you don’t have a keyboard.

P.S. It would be nice if we can change the pen weight programmatically since the default pen weight is a bit too thick. However, I haven’t spent enough time to get it working. All I know is that this is not possible in the object model that PowerPoint exposes. That, however, only means this cannot be done easily (as I have an example here). On the other hand, Anupam has pointed out that you only need to change the pen weight once during a slide show and the weight will not change during the rest of the show. I guess it’s easier to wait for Office 12 to come out than to nail it. :P

Talks in Portrait Mode

Have you ever seen people giving talks in portrait mode? I think all presentation softwares default to landscape mode since the projectors are usually in landscape. So the choice is deliberate.

Here is a calculation. A 4:3 display at 1024*768 has 786432 pixels. If we maintain the 4:3 aspect ratio in landscape, that would be 768*(768/4*3)=768*576 = 442368 pixels, a 43% decrease. Now the speaker may have actually used an aspect ratio of 11:8.5 because of letter sized paper, but I wouldn’t be able to tell since that gives 768*(768/11*8.5)=768*593=455424, still 42%.

So I am curious, why would anyone not want to use those 40% of the available pixels? Do you know what the reason may be?

Architects, Chefs and Academic Research Scientists

According to this article in CNN Money, these three jobs have at least one thing in common. Since I am not yet interested in being an architect and I can’t stand the heat in the kitchen, my perspective would be that

A career with one of the most disproportionate ratios of training to pay is that of academic research scientist.

Perhaps a Jedi knight training would have gotten me more pay. Patience…

And it’s made tougher still by the fact that in many disciplines, there aren’t nearly as many tenure-track positions as there are candidates.

I can easily imagine this being the case for liberal arts subjects, but perhaps this is also true, or getting more true, for Computer Science too. I don’t have figures to know the situation.

At the end of the article is this inspiring, surprising list of six-figure jobs: stunt driver; auctioneer; matchmaker; head groundskeeper; and fashion-trend forecaster.

I predict that next year’s fashion will be different from this year’s. Really, that makes my mom feel proud.

Really Bad PowerPoint (and How to Avoid It)

http://www.sethgodin.com/freeprize/reallybad-1.pdf

The first thing that most people use PowerPoint for is a TelePrompter!

Indeed, and even though the presenter mode can show you notes, it doesn’t really help unless you tend to stand at the podium. But if you stay there too often, then you surely will become a voice, or worse, noise in the background because the audience will look at your slides and not you.

At the end of the day, technology didn’t take away the basic requirement of delivering good talks: you need to be very, very familiar with your own talk. Then you can walk around and still know what should be said and what’s coming up next.

Wireless Presentation Controls

Over the years I have tried around 10 wireless presentation controls. Three of them are quite notable.

All devices I tried are all Radio Frequency based, so there is no line-of-sight requirement as in Infra-Red devices. (Contrast Wavelan with your TV remote.)

My criteria for a good control is very simple: if the control is within 50 feet of the receiver with no obstruction in-between, then all clicks must be received properly. Not a single missed click is allowed. (50 feet is about the width of the largest projection screen here at CMU.)

To achieve this level of guarantee, there must be (i) tactile click feedback, or else you can’t tell if you really clicked or not, and (ii) a weak battery indicator, or else you can’t tell if it’s a battery problem or a radio problem.

The ones that I like include:

  • Gyration Ultra

    This is by far the champion in terms of operation distance. It comes with two rechargeable battery packs and a battery pack adaptor that allows you to use 3 AAA batteries.

    It’s a gyroscopic mouse, meaning that if you hold onto a special button, then moving your hand in the air will move the mouse cursor. With this feature, I can point to an object with my mouse cursor. I prefer it to a laser pointer (which Ultra doesn’t have) since the mouse cursor can stay at a fixed location (no shaking red dot) and I cannot make the mistake of shooting laser into the audience. It also has a mouse wheel that acts as the third button and lets you flip through slides if you need to.

    I do have two small complaints about the current design: (i) There is a sensor at the bottom of the mouse to sense whether the mouse is in the air or not. The light generated by this sensor can be distracting to the audience. So I try to point the mouse down while giving talks. (ii) There is no way to turn off the mouse. You have to remove the battery pack.

  • Logitech Bluetooth Cordless Presenter

    It’s a bit bigger than Gyration Ultra and it also has great radio performance. It uses 2 AA battery.

    The most important advantage of this device is its click handling. Once you made a click, the device will try its best to get the signal across. In my testing, some clicks will be received several seconds after clicking and that makes me think there is retransmission going on. There is even a indicator light that tells you whether the transmission is successful or not. I really like this approach.

    My complaint about this one is that it is too easy to hit the laser pointer button by mistake. The button is large and soft and is positioned right at the bottom for you to easily make that mistake. Also, while it’s Bluetooth, I have not been able to pair it with my ThinkPad X30′s Bluetooth receiver. The mini-receiver that Logitech provides work well though. Overall a worthy competitor to the Ultra.

  • Honeywell Power Presenter

    I would guess that this uses exactly the same radio circuitry they use for their garage door remote. The size of the control is similar to the size of the garage door remote, so it’s quite a bit smaller and lighter than the previous two. It also features a laser pointer. It uses CR2032, so always carry an extra with you.

    It works well if you know this trick: move your hand while you are holding down a button. Somehow this movement is necessary to ensure the click goes through. Once the click has been received, you can release the button. This can take up to about half a second.

    The complaint I have is the awkward click handling (which admittedly I have already gotten used to). I prefer the retransmission approach and it matters since this one does not have as good a range as the previous two.

The one thing I did not test is security proofness. I can imagine that at a conference, there can be multiple identical products being used in close distance. Well, if the speaker in the next room can also advance your slides… :P

P.S. I also note that these days some Bluetooth-enabled cellphones can be used as a presentation control. Check out floAt’s Mobile Agent if you have a Sony Ericsson phone. I also remember there is a commercial software that does similar things using the volume keys on Bluetooth cellphones.

P.P.S. If you are at CMU, Luis has the Logitech and I have the Gyration and the Honeywell. Adam also has the Gyration.

Principles of Effective Research by Michael Nielsen

http://www.qinfo.org/people/nielsen/blog/?p=120

The philosophy underlying the essay is based on a famous quote attributed to Aristotle: “We are what we repeatedly do. Excellence, then, is not an act but a habit.’’ Underlying all our habits are models (often unconscious) of how the world works.

Thanks to Aristotle, and Nielsen. :P

You and Your Research

Note: it seeems that the web has no lack of copies of this excellent talk. This one is copied from http://www.cs.virginia.edu/~robins/YouAndYourResearch.html. A PDF version is available at http://www.ocf.berkeley.edu/~wwu/readordie/hamming.pdf and here is a local copy.

This talk really makes me think a lot… but…


Richard Hamming


“You and Your Research”

Transcription of the
Bell Communications Research Colloquium
Seminar
7 March 1986


Read more »