Still thinking about graphs. This weekend I wrote some OCaml: a functional dijkstra’s shortest-paths algorithm, and an accompanying graphviz .dot file parser (first using ocamlp4’s recursive-descent parser, then a ocamllex/yacc parser). The plan was to have dot (or neato or whichever program) lay out the nodes in the graph-- one of its output file formats is “input file with position tags added”-- and then regraph it with only the edges needed to show shortest paths. Unfortunately, I got the whole thing working only to discover that (apparently?) dot (which I was then planning to use on my output files to rerender the graphs) always redoes the node layout.

Working with OCaml is still really hard for me. The whole project would’ve been a hundred lines in a language I’m familiar with, but it ended up near 900 (though admittedly, this is much more rigorous). But if I want to be knowledgable enough about it to even be able to dismiss it, I must continue to try.

My primary observation so far is that the common claim I see about ML-- “Once your program compiles, it does the right thing”-- is very true in practice. That's really awesome and useful. On the other hand, the compiler errors you get can be really hard to track down. (Those two aren’t necessarily related: the first (good) one is a product of a strong static type system, while the second (bad) one comes from the type inference.)

Feel free to criticize; I’m still trying to learn. I also didn’t use any (for example) hash tables or arrays so I doubt I get the same runtime as a real implementation would.

Update: stupid LogJam ate some of my double-primes. Sorry about that.

Working with OCaml is still really hard for me. The whole project would’ve been a hundred lines in a language I’m familiar with, but it ended up near 900 (though admittedly, this is much more rigorous). But if I want to be knowledgable enough about it to even be able to dismiss it, I must continue to try.

My primary observation so far is that the common claim I see about ML-- “Once your program compiles, it does the right thing”-- is very true in practice. That's really awesome and useful. On the other hand, the compiler errors you get can be really hard to track down. (Those two aren’t necessarily related: the first (good) one is a product of a strong static type system, while the second (bad) one comes from the type inference.)

Feel free to criticize; I’m still trying to learn. I also didn’t use any (for example) hash tables or arrays so I doubt I get the same runtime as a real implementation would.

let dijkstra edges src = (* given full edge list and source, return list of outgoing edges from source. *) let rec outgoing edges src = match edges with [] -> [] | Edge(n1, c, n2, _)::rest -> let outs = outgoing rest src in if n1 = src then (n2, c)::outs else if n2 = src then (n1, c)::outs else outs in (* tentative list, new tentative item -> updated tentative list (update the cost if the item’s already there, otherwise prepend it.) *) let rec update_tentative tentative ((nn, nc, np) as newt) = match tentative with [] -> [newt] | ((n,c,p) as orig)::rest -> if nn = n then (if nc < c then newt::rest else orig::rest) else orig::(update_tentative rest newt) in (* one iteration of dijkstra’s algorithm. process all neighbors of the latest confirmed node. *) let dijkstra_iter confirmed tentative = let (cur,cost,_) = List.hd confirmed in (* update_tentative on all neighbors of current node. *) let rec update_neighbors t outs = let update_neighbor t (n,ec) = if (List.exists (fun (conf,_,_) -> conf=n) confirmed) then t else update_tentative t (n,cost+ec,cur) in match outs with [] -> t | out::rest -> let t’ = update_neighbor t out in update_neighbors t’ rest in (* remove and return lowest-cost entry on tentative list. *) let pop_lowest t = let rec find_lowest t ((_,lowcost,_) as lowest) = match t with [] -> lowest | ((_,cost,_) as cur)::rest -> if cost < lowcost then find_lowest rest cur else find_lowest rest lowest in let rec remove target l = match l with [] -> [] | cur::rest -> if cur = target then rest else cur::(remove target rest) in let lowest = find_lowest t (List.hd t) in (lowest, remove lowest t) in (* the heart of the algorithm: *) (* (1) update tentative with all of cur’s outgoing connections; *) let out = (outgoing edges cur) in let t’ = update_neighbors tentative out in (* (2) and pop the lowest-cost tentative onto confirmed. *) let (lowest, t’‘) = pop_lowest t’ in (lowest::confirmed, t’‘) in let rec run c t = match t with [] -> c (* done *) | _ -> let (c’, t’) = dijkstra_iter c t in run c’ t’ in (* need to iterate once so something is on the tentative list to start with. i need a do..while loop, not a while loop. :P *) let (c, t) = dijkstra_iter [ (src, 0, src) ] [] in run c t

Update: stupid LogJam ate some of my double-primes. Sorry about that.

sstricklevan## i knew someone was gonna ask this!

The formatting was just me typing carefully. :PBut the HTML is VIM's "TOhtml" command. I think it uses the built-in syntax highlighter to generate HTML that looks just like your code.

:help 2htmlshould provide more.xaosenkosmosI've been staring at some weird stuff with package management code at work. We have this strange, crusty way of looking at things. See, we want every version of every package that's been installed to be executable at any given point in time. This is a Big Fucking Mess, but we've got most of it down. Think of dynamic libraries and whatnot. You have to know exactly what a given program depends on and hack up your linker flags as appropriate. We haven't even gotten to perl and other interpreted languages yet; i've got a design for kludging up @INC in perl programs; similar code should be all that's needed for other languages.

The real trick is breaking things down modularly. The graph stuff starts leaking in when you're installing "incestuous" packages, like apache and mod_ssl. You have to install mod_ssl for a given version of apache, and a version of apache that supports a given version of mod_ssl. That is, they depend on each other. This makes compatibility testing a bit more complicated than it used to be, with some really tiresome details to keep track of.

I haven't been able to make too much progress on the design because nobody understands it well enough and the boss doesn't want to spend the time on it (?!). Very frustrating, but potentially very pleasant at the end of things.

evanboth because i'm TAing networking (routing algorithms) and because of lj's friends graph. and there's overlap: the researcher i talked to seemed interested in figuring out why people can route stuff efficiently even when they have knowledge of only a tiny part of the graph.