Archive for March, 2007

ACO Seminar 2007-03-29

Title : Turan problems on the hypercube
Speaker : David Offner
When : 4:30-5:30pm, March 29th
Where : PPB 300

Abstract: Turan’s theorem gives the number of edges possible in an n-vertex graph with no k-clique. Since this result was proved in 1941, numerous generalizatons and variations have been studied. In this talk, we survey some conjectures, problems and results on Turan-type problems where the base graph is a hypercube.

In particular, we define c_d to be the minimum proportion of the edges which must be removed from any hypercube so that it does not contain a d-dimensional subcube as a subgraph, and p_d to be the maximum number of colors with which one can color the edges of any hypercube such that any d-dimensional subcube contains every color. Note c_d < 1/p_d.

We prove the exact value of p_d for every d, and present some observations about c_d.

Includes joint work with Oleg Pikhurko.

Comments

Theory Seminar 2007-03-30

Friday, March 30th, 2007
3:30pm
WEH 7220

TITLE:
An O(n log n) algorithm for maximum st-flow in a directed planar graph

Glencora Borradaile
Brown University

ABSTRACT:

We give the first correct O(n log n) algorithm for finding a maximum single-source, single-sink maximum st-flow in a directed planar graph. After a preprocessing step that consists of finding single-source shortest path distances in the dual, the algorithms consists of repeatedly saturating the leftmost residual s-to-t path. While the algorithm is very simple, the analysis is more complicated.

I also will overview other planar graph algorithms that we are working on.

Comments

Theory Lunch 2007-03-28

SPEAKER: Adam Wierman
TIME: Wednesday 12-1pm, March 28, 2007
PLACE: NSH 1507
TITLE: Fairness in queues

ABSTRACT:

The growing trend in computer systems towards using scheduling policies that prioritize jobs with small service requirements has resulted in a new focus on the fairness of such policies. In particular, researchers have been interested in whether biasing towards small job sizes results in large jobs being treated “unfairly.” However, unfairness is an amorphous concept and thus difficult to define and study. In this talk, I will present some recent attempts to define and study the concept of fairness in single server queueing settings. Rather than providing the unique, correct definition of fairness for queues, my goal in this talk is to illustrate, compare, and contrast, the range of fairness measures that have been suggested and to summarize what interesting open questions remain for each.

Comments

Theory Seminar 2007-03-22

MARCH 22ND, 2007
3:00PM
WEAN HALL 4625

TITLE:
Tight integrality gaps for Vertex Cover SDPs in the Lovasz-Schriver Hierarchy

SPEAKER: Toniann Pitassi, University of Toronto

ABSTRACT:
A vertex cover in a graph G=(V,E) is a subset S of the vertices such that every edge intersects S in at least one endpoint. The minimum vertex cover problem asks for the size of the minimum vertex cover in G. Determining how well we can approximate vertex cover is one of the outstanding open problems in the complexity of approximation. The best known approximation is a factor of 2, whereas the best inapproximability result of Dinur and Safra shows that it is NP hard to solve the problem within a factor of 1.36.

To get a better picture of the approximability of vertex cover, in light of our inability to resolve the issue with PCP-based methods, Arora-et-al sugggested ruling out good approximations for vertex cover by large families of algorithms. One such important family is the class of relaxations for vertex cover in the Lovasz-Schriver hierarchies.

In this talk, we will survey what is currently known, and present a new result showing that the integrality gap for vertex cover is 2-o(1), even after $\sqrt (\logn / \loglogn)$ rounds of the SDP LS+ system of Lovasz and Schriver.

This is joint work with Konstantinos Georgiou, Avner Magen, and Iannis Tourlakis.

Comments

ACO Seminar 2007-03-08

Title : Understanding Parallel Repetition Requires Understanding Foams.
Speaker: Ryan O’Donnell
When : March 8th, 4:30-5:30pm
Where : PPB 300

Abstract:

Motivated by important problems in complexity theory (the Unique Games Conjecture) and the foundations of quantum mechanics (QM vs. local hidden variable theories), we investigate the best parameters obtainable in the Parallel Repetition Theorem.

Unfortunately, we don’t get very far.

But, it’s because we get blocked by the following seemingly very hard question from the geometry of “foams”: What is the least surface area of a cell that tiles R^d by the lattice Z^d?

Very little about this foam problem is known. It is so vexing that I will offer monetary rewards for baby steps of progress.

This is joint work with Uri Feige (Microsoft Research/Weizmann) and Guy Kindler (Weizmann).

Comments

Theory Seminar 2007-03-09

FRIDAY MARCH 9th, 2007
3:30pm
WEH 7220

Every linear threshold function has a low-weight approximator

Rocco Servedio
Columbia University

A linear threshold function, or halfspace, is defined by a hyperplane w•x=θ through n-dimensional Euclidean space; it assigns a binary label to each input point according to which side of the hyperplane the point lies on. Linear threshold functions are well studied in areas such as learning theory and complexity theory; in particular, linear threshold functions with small integer weights are often of special interest.

Given any linear threshold function f on n Boolean variables, we construct a linear threshold function g which disagrees with f on at most an ε fraction of inputs and has integer weights each of magnitude at most n½ • 2^{Õ(1/ε2)}. The construction is optimal in terms of its dependence on n.

We give two applications. The first is a deterministic algorithm for approximately counting the fraction of satisfying assignments to zero-one knapsack problems to within an additive ±ε. The algorithm runs in time polynomial in n for any constant ε. In our second application, we show that any linear threshold function f is specified to within error ε by estimates of its Chow parameters (degree 0 and 1 Fourier coefficients) which are accurate to within an additive ±1/(n • 2^{Õ(1/ε2)}). This is the first such accuracy bound which is inverse polynomial in n, and gives the first polynomial bound (in terms of n) on the number of examples required for learning linear threshold functions in the “restricted focus of attention'’ framework.

Comments

Switching From Yap to Dviout

(Most recently updated on 2008-02-08, but I have kept the referenced MiKTeX version at 2.5. As far as I know, none of the changes in MiKTeX 2.6 and 2.7 matters to the topic of this post.)

I have recently upgraded my MiKTeX to the latest version 2.5 for some reason. And as much as I love MiKTeX, I have switched from YAP (Yet Another Previewer, the DVI previewer in MiKTeX) to Dviout.

Dviout may not be very familiar to non-Japanese users because, to my understanding, it is primarily developed to deal with Japanese usage (think about fonts). However, it works equally well in English too. I can recommend Dviout over YAP 2.5 for several reasons:

  • Dviout pretty much does everything YAP is capable of: color specials, HyperTeX jumps (you use the hyperref package, don’t you?), and perhaps most importantly, round-tripping using source specials (but see the note below).
  • Unlike YAP 2.5, it does not seem to have any DVI locking issue, at least not in my daily usage of working on a large document. (To me, this alone is worth the switch until YAP eliminates the lock issue.)
  • In my experience, Dviout is much faster than YAP ever since it had a major change in version 2.5. Unlike YAP, Dviout does not make you choose between “dvips” vs “pk font” rendering—the former is slow, and the latter cannot be used when your document has any graphics. There is really no good choice in YAP.
  • The particular document I am working with is also somewhat graphics intensive. Like YAP, Dviout is able to call Ghostscript to show the included graphics in the DVI preview. But it also caches the generated bitmaps and use them later for unmodified figures. This saves me a lot of time. (And this is the primary reason why I expect to stick with Dviout for a while. I sprinkle \parpics and wrapfigures quite liberally on my spaghetti.)
  • Dviout allows you to customize the shortcut keys. I like to exercise my left hand more often and so I set some shortcuts on my left hand.
  • Dviout can do a text search directly inside DVIs. This can be handy if all you have is a DVI file. (Package documentation comes to mind. And yes, you could have converted it to PS and then do the search in the PS…)

(Note: there is an issue that I am still puzzling on. In the forward search, sometimes Dviout will land on a few paragraphs earlier than the intended destination. I will update this post once I figure out the solution.

Update 2007-03-07: Nailed it. First, recall that source specials are implemented by inserting tags that specify the source file and the corresponding line number into the DVI file. The problem is that sometimes the line numbers in the tags are not in increasing order. Here is how the stream of tags might look like: {src:1003main.tex}{src:1055main.tex}{src:1015main.tex}{src:1019main.te x}. In this case, if you ask the DVI previewer to go to line 1015 of main.tex, a simple scan algorithm will hit the second tag and then incorrectly conclude that no further reading is necessary. This is exactly what happened. Workaround? I guess I will have to wait until I finish my current project and patch the source directly.)

Below I will show how to get Dviout working with MiKTeX 2.5. (Update 2007-11-15: The same instruction works for MiKTeX 2.6 because there is no change in directory layout.)

  1. Remove the directory C:\localtexmf\fonts if you have one. It contains your existing PK font files. We can afford to regenerate them. MiKTeX 2.5 by default put PK files in a different location too.
  2. Create a new file dviout.par and put it in the same directory with dviout.exe. It’s the preference file for Dviout.
  3. I have included the non-default parameters that I use at the end of this post. You can use it as your first dviout.par and start from there. As you can see, there are some hard-coded path names that you will need to change. The following is a list of parameters that you want to pay attention to, in the order of appearance in my dviout.par below:
    GS
    This means “On” with “Direct PS” turned off in the Graphic tab. Keeping Direct PS off can mitigate a class of problems due to dvi files prepared with source specials containing postscript. For instance, you can generate such a dvi file by using hyperref with the dvips driver, which is usually the default. The right driver for dviout is hypertex, used like this \usepackage[hypertex]{hyperref}.
    gsx
    You may have installed Ghostscript somewhere else. I note that MiKTeX comes with Ghoscript these days, but I usually keep an installation of the most recent Ghoscript so I don’t use the one that comes with MiKTeX.
    gdat
    All my documents store their figures in a figure subdirectory. The current setting tells Dviout to cache the preview bitmaps in the same subdirectory instead of the parent directory where the DVI is located.
    gen
    I installed my MiKTeX there. The default for 2.7 would be `C:\Progra~1\miktex~1.7\bin\makepk.exe ^s ^d ^D ^M. There should be no space in the path name and so you must use short path. Also, be careful with the backquote character.
    key
    I have added four shortcut keys to my config. “z” and “x” are for back and forward, “-” and “=” are for zooming out and in. Also, note that “space” is by default mapped to next-page, and so the pair “z” and “space” is all I use most of the time. Remember, the shortcuts are fully configurable in the Option->Set Parameter->Key tab.
    BMP and scale
    I have made an unusual choice here because of my late-night TeXing. The negative numbers represent negative gamma, which is to say that the DVI text is displayed in white on black. If you want the normal black on white, change the -400 to 1100. The particular number depends largely on the font you use and your monitor, but 1100 is in the ball park. (This answers rmcd’s comment.)
    TEXROOT
    For MiKTeX 2.6, the current setting is a sensible choice. This is what ^r in the TEXPK option expands into.
    TEXPK
    If you have installed MiKTeX 2.6 at its default destination, use TEXPK=”^r\fonts\pk\\dpi^d\^s.pk;C:\Program Files\MiKTeX 2.6\fonts\vf\\^s.vf” instead. You can read my comment on 2007-03-23 to understand what it does and how you can come up with the right setting for your machine with the help of YAP.
    TEXFONTS
    According to the Dviout manual, this option is needed when you cannot generate a PK font but you want to replace the missing characters with spaces of the correct size, hence the need to locate the TFM (font metric) files. I would say that if you cannot generate an English PK font, then you want to resolve the issue completely instead of relying on this feature.
    src
    I use GNU Emacs on Windows since a long time ago and I am still at version 21. That explains why I picked gnuclientw as my src editor. In the comments, rmcd explained how to use emacsclient instead. Dviout supports DDE as well and so you can use many other editors. See the help file for instructions.
    gsrc
    I use Ipe for most of my graphics. That explains the path in gsrc.
    isp
    With the GS option above, we have turned off Direct PS. This “ignore some source specials” setting further suppresses a whole bunch of (mostly harmless) warnings. There are pros and cons, but I think this together with the GS setting above is a sensible default.
    file
    This is where you put dviout.par. Of course, you don’t really change this setting by changing it inside dviout.par, but this is a non-default setting with a hardcoded path. :P
  4. Finally, to make AUCTeX use Dviout instead of YAP, here is a snip from my tex-site.el. Note that I am showing you Emacs Lisp here and as rmcd pointed out in his comment, you do not want to enter the backslashes before the two inner double-quotes if you are using Emacs’s customize-variable command to change your TeX-view-style. The backslashes are only necessary in Emacs Lisp so that we escape the two inner double-quotes.

    (setq TeX-view-style
    '(("^a5$" "yap %d -paper a5")
    ("^landscape$" "yap %d -paper a4r -s 4")
    ("^epsf$" "gsview32 %f")
    ;;("." "yap -1 -s%n%b %d")
    ("." "dviout -1 %d \"# %n %b\"")))

Here are some usage tips on Dviout:

  • As much as I prefer not to ask you to touch the registry, you want to know that Dviout remembers some of its settings in HKCU\Software\SHIMA\DVIOUT. For instance, it stores the window dimensions there. (I set my xPos, yPos, Height and Width in the registry once and for all.) I believe that all options set outside of Option->Set Parameters, such as Option->Continuous Renew, are stored in the registry.
  • You may also want to disable Options->Check color specials from the menu in case Dviout complains about your DVI file. It happens when your color specials have certain issues as explained in the manual. I meet these issues too frequently.
  • Make sure you enable Options->Setup Parameters...->System->Auto Renew. (It’s on by default, but it’s worth checking in case you turned it off by accident.) It forces a update check when you switch into Dviout from another application. This behavior is shared among most, if not all, DVI previewers.
  • Options->Continuous Renew is a good thing if you have a large monitor or a dual monitor to let you see both your text editor and Dviout without overlap. This option will cause Dviout to pop into foreground when the DVI file is updated. Using Microsoft’s Tweak UI, in General->Focus, enable Prevent applications from stealing focus and set it to flash the taskbar button one time. That will make Dviout always show you the latest DVI without stealing the focus from your text editor. (I use this with the continuous mode in latexmk.)
  • Most importantly, once you have customized the options using Option->Setup Parameters to your own liking, select Option->Non-default Parameters. A small text box will pop up and you can copy and paste its content into your dviout.par. (That’s how I get the file below.)

Have fun with Dviout!

Finally here is my current dviout.par (last updated 2008-02-08). I have also uploaded it as a file to avoid line wrapping and other funny things due to quotes.

br=0x800000
bf=0x800000
bb=0x1d4c000
multi=4
button=+
dpi=600
GIF=5
GS=17
gsx=C:\Progra~1\gs\gs\bin\gswin32c.exe^-IC:\Progra~1\gs\fonts;C:\Progr a~1\gs\gs\lib;C:\Progra~1\gs\gs\Resource^-dTextAlphaBits=4^-dGraphicsA lphaBits=4
gdat=./figures
gsize=+
search=448
sdpi=600
hyper=0x1d030
hbuf=0x400000
ToEdit=^c^V
FB=2
varf=+
gen="`C:\texmf\miktex\bin\makepk.exe ^s ^d ^D ^M"
y=F614.295pt:794.96999ptP
key=DR=-:DM==:JL=x:JF=z
log=C:\temp\dviout.log
BMP=6:6:-400
scale=2:2:-400:4:4:-400:2:2:1100:4:4:1100
TEXROOT="C:\Documents and Settings\All Users\Application Data\MiKTeX\2.6"
TEXPK=^r\fonts\pk\\dpi^d\^s.pk;C:\texmf\fonts\vf\\^s.vf
TEXFONTS=C:\texmf\fonts\tfm\\
src='gnuclientw^s+%d "%s"'
gsrc=.eps.pdf=C:\UnixHome\ipe\bin\ipe.exe^s"%n.ipe"
isp=ps:SDict;!
file=c:\unixhome\dviout\dviout.par

P.S. I should mention that staying with MiKTeX 2.4 would not be a bad choice if I could… You should also be reminded of potential issues if you decide to upgrade to Vista before MiKTeX 2.6 comes out.

Comments (23)

Thesis Proposal 2007-03-12

March 12, 2007
2:00 PM, 7220 Wean Hall

Srinath Sridhar
Thesis Proposal

Title: Algorithms for Analyzing Intraspecific Sequence Variation

Abstract:

Analysis of human genetic variation has gained significant momentum due to the success of many large-scale sequencing projects. Within the last five years, about 4 million single nucleotide polymorphisms (SNPs) have been genotyped over hundreds of individuals belonging to several ethnic groups. Recently, researchers have made significant progress in detecting and cataloging copy number polymorphisms (CNPs) as well. These large-scale efforts have now made it possible to analyze genetic variation within a single species, mainly humans, at very fine-scales. In our work, we address a fundamental question: how can we explain and understand these genetic differences that are present within a single species?

In this work, we primarily focus on two methods to analyze SNP variation data within humans. We develop maximum parsimony phylogenetic tree reconstruction algorithms that are specifically catered to work on SNP data. Such a phylogeny should cluster closely related individuals (perhaps an ethnic group) together. Therefore, these techniques are widely used to answer questions on human migration. A second method to understand genetic variation is to directly cluster the SNP data based on the property that individuals within a cluster would share similar genotypes (and phenotypes).

Mathematically, we work with two variants of the phylogeny reconstruction problem, both of which are NP-complete. The first variant is equivalent to finding a Steiner minimum tree on a hypercube and the second is a generalization of this problem. We solve the two variants in polynomial time when the size of the phylogeny (Steiner minimum tree) is `small’. We solve the clustering problem by finding the max-cut of a graph, a well-known NP-complete problem. We can show that if the graph is generated with the clusters `planted’, then our algorithm returns the max-cut in polynomial time.

Thesis Committee:
Guy E. Blelloch, Co-Chair
Russell Schwartz, Co-Chair
R. Ravi
Eran Halperin, International Computer Science Institute, Berkeley

Comments

Theory Lunch 2007-03-07

SPEAKER: Michelle Goodstein
TIME: Wednesday 12-1pm, March 7, 2007
PLACE: NSH 1507
TITLE: A Two Player Game to Combat Web Spam

ABSTRACT:

We present a novel approach to combating web spam. In the spirit of Luis von Ahn’s games with a purpose, we propose using a two player game to identify spam pages within search results. We loosely define web spam as those pages that are ranked higher than their relevance merits. Our game asks users to classify a page as either highly relevant to a query or not relevant to a query, with the option of passing. We use data from the game as the input to a simple voting algorithm which determines whether a page should be removed from the ranking. We show that the best strategy for users playing the game for fun is to answer truthfully, and that adversarial players have difficulty obstructing the game. We aim to provide a service which functions in addition to automated web spam detection techniques and can help correct errors within a ranking by removing those pages that users find irrelevant.

Joint work with Virginia Vassilevska.

Comments