11:35 am, 1 Dec 03
lj analyses
- The LiveJournal interests graph is odd: each user is linked to a bunch of interests, and each interest is linked to a bunch of users (in this representation, all links are bidirectional). No users are directly linked and no interests are directly linked.
Two subset interpretations:
- Make the nodes exclusively users and the links to other users based on shared interests. Now you can answer "what do [a] and [b] have in common?" using shortest paths.
- Make the nodes exclusively interests and the links to other interests based on users who include both interests. (This graph may be more useful than the one above if the number of interests is below 20k or so, because then we can do full graph computations on a normal computer.) Make the link cost an inverse function of the number of users who include both interests. Now you can find the diameter (and the maximally distant pair) as the "most dissimilar" interests; users who deliberately pick dissimilar interests to mess up the graph will only add a high-cost link.
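A minimal sketch of the interests-only graph, assuming the data is available as a plain dict from user to interest list (the names `interest_graph`, `distances_from` and `most_dissimilar` are made up for illustration, and 1/count is just one possible inverse cost). It runs Dijkstra from every interest, which is only feasible for a graph small enough to fit on a normal computer:

```python
from collections import defaultdict
from itertools import combinations
import heapq


def interest_graph(user_interests):
    """Project user -> interests onto an interest-interest graph.

    The cost of a link is 1 / (number of users listing both interests),
    so interests shared by many users end up close together.
    """
    shared = defaultdict(int)
    for interests in user_interests.values():
        for a, b in combinations(sorted(set(interests)), 2):
            shared[(a, b)] += 1
    graph = defaultdict(dict)
    for (a, b), count in shared.items():
        graph[a][b] = graph[b][a] = 1.0 / count
    return graph


def distances_from(graph, source):
    """Dijkstra: cheapest distance from source to every reachable interest."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nbr, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return dist


def most_dissimilar(graph):
    """Return (distance, a, b) for the farthest-apart connected pair."""
    best = (0.0, None, None)
    for source in list(graph):
        for target, d in distances_from(graph, source).items():
            if d > best[0]:
                best = (d, source, target)
    return best


# Toy data, invented for the example.
users = {
    "u1": ["debian", "linux"],
    "u2": ["debian", "linux"],
    "u3": ["linux", "music"],
    "u4": ["music", "knitting"],
}
print(most_dissimilar(interest_graph(users)))
```

Interests with no path between them are ignored here; a real run would probably want to report disconnected components separately.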
Figuring out who in general has "interests in common" with a given user is a different problem. I recall some stuff from my stats class about relative information (a common interest like "music" is much less meaningful than a specific one like "debian") but I don't remember the details of how it works. :(
- The shortest-path problem between two users or interests is much easier to solve if you're only computing it for one specific pair. A breadth-first search from both ends at the same time ought to be pretty quick; I imagine this is how LJ Connect actually works.
- I'd really like to run some imaginary "flow" algorithm and figure out who is the "most popular" (high in-degree in friends links isn't as meaningful as "valuable" links -- think Google's PageRank), but I don't know anything about how those work.
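The search from both ends mentioned above could look something like this. It is only a sketch: it assumes the graph fits in a dict mapping each node (user or interest) to its neighbours, the function name `shortest_path` is made up, and I have no idea whether LJ Connect really does it this way.

```python
from collections import deque


def shortest_path(neighbors, start, goal):
    """Bidirectional BFS on an undirected, unweighted graph.

    neighbors maps each node to an iterable of adjacent nodes.
    Returns the list of nodes from start to goal, or None if unconnected.
    """
    if start == goal:
        return [start]
    parents_fwd = {start: None}   # search tree grown from start
    parents_bwd = {goal: None}    # search tree grown from goal
    frontier_fwd = deque([start])
    frontier_bwd = deque([goal])

    def expand(frontier, parents, other_parents):
        # Expand one full level; return a node seen by both searches, if any.
        for _ in range(len(frontier)):
            node = frontier.popleft()
            for nbr in neighbors.get(node, ()):
                if nbr in parents:
                    continue
                parents[nbr] = node
                if nbr in other_parents:
                    return nbr
                frontier.append(nbr)
        return None

    while frontier_fwd and frontier_bwd:
        # Always grow the smaller frontier to keep the searches balanced.
        if len(frontier_fwd) <= len(frontier_bwd):
            meet = expand(frontier_fwd, parents_fwd, parents_bwd)
        else:
            meet = expand(frontier_bwd, parents_bwd, parents_fwd)
        if meet is not None:
            path = []
            node = meet
            while node is not None:       # walk back to start
                path.append(node)
                node = parents_fwd[node]
            path.reverse()
            node = parents_bwd[meet]
            while node is not None:       # walk forward to goal
                path.append(node)
                node = parents_bwd[node]
            return path
    return None


# Toy user/interest graph, invented for the example; links go both ways.
links = {
    "alice": ["debian", "music"],
    "bob": ["music", "knitting"],
    "carol": ["knitting"],
    "dave": [],
    "debian": ["alice"],
    "music": ["alice", "bob"],
    "knitting": ["bob", "carol"],
}
print(shortest_path(links, "alice", "carol"))
```

On the user/interest graph the path alternates between users and interests, so the interests along it are the "what do [a] and [b] have in common?" answer.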
LiveJournal Interests graph
Just out of random, I thought I would let you know that this graph is actually being shown in INFO 424 by Prof. Dave Hendry as part of the course to aid the students in understanding network visualization. =)
Re: LiveJournal Interests graph
Which graph?
I think you're looking for the mutual information measure. But I don't have the textbook in front of me, so I may be remembering wrong.
My class in pattern recognition might be interested in this data too...
silly question
do you know of a place where all the people who are doing lj analysis stuff hang out? there is the trust thingie, lj connect and some others i've seen but i'm interested in something really comprehensive. any pointers would be great.
Re: silly question
no, but it'd be great if there was one.
Re: silly question
i've been thinking about creating a community about lj topography questions, because i have a lot. but, i'm not really sure the interest is there to really generate a lot of discussion. Have you seen any 3d renderings of friends' networks? i was looking into elegant ways of displaying friends networks/groupings/what-have-you and found graphviz (the de facto standard) to be rather ugly. i'm sure lj couldn't handle the load of rendering 3d maps for people, but i was sort of thinking, as fast as lj grows (in users)... the friends map doesn't grow nearly as fast... maybe one could cache the results of a clever placement algorithm... well, i'll stop babbling here.