Evan Martin (evan) wrote in evan_tech,

a better graph

I finally figured out something that I've been fighting with forever: I have a bunch of data that I want to get the general trend of. My weak stats background makes me think of Kernel Smoothing [that looks like a pretty good overview], but in statsland you're always doing it over probability distributions and not counts, like I have above.

The solution is to do it effectively manually: the R dnorm function is the density function of a normal distribution, and then the filter function effectively, uh, convolves that kernel against the data. But I'm not quite sure yet that this is correct: does it, for each data point, sum in all the contributions of the points around it using the normal distribution? Or does each point contribute a normal distribution's worth of density to the points around it? I'm shamefully poor at this stuff.

In any case, the above was generated with:
plot(d$date, filter(d$count, dnorm(-40:40, sd=20)),
     type='l', main='Average Posts per Day',
     ylab='Posts', xlab='Year',
     frame.plot=FALSE, lwd=2)
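
On the question of which way the contributions flow: for a linear convolution the two readings should give the same numbers, and the sketch below checks that on toy data. Everything in it apart from dnorm and filter is an assumption made for illustration: the counts are made-up Poisson draws, not the post's data, and the names counts, k, h, gathered, and scattered are hypothetical. It also rescales the kernel so its weights sum to 1, which the plot call above does not do; the raw dnorm(-40:40, sd=20) weights sum to a bit less than 1, so the plotted curve sits slightly below a true weighted average.

```r
# Toy daily counts (made-up data, for illustration only).
set.seed(1)
counts <- rpois(200, lambda = 5)

# Same kernel shape as in the post, rescaled so the weights sum to 1.
k <- dnorm(-40:40, sd = 20)
k <- k / sum(k)
h <- (length(k) - 1) / 2   # half-width of the kernel: 40 points each side

# View 1: let filter() do the work. Each output value is a weighted
# sum of the input values around it.
gathered <- as.numeric(stats::filter(counts, k))

# View 2: each point spreads a scaled copy of the kernel onto the
# positions around it. The buffer is padded by h on both sides so the
# copies can run off the ends.
n <- length(counts)
scattered <- numeric(n + 2 * h)
for (i in seq_len(n)) {
  idx <- i:(i + 2 * h)
  scattered[idx] <- scattered[idx] + counts[i] * k
}
scattered <- scattered[(h + 1):(h + n)]   # drop the padding

# filter() returns NA wherever the kernel hangs off either end, so
# compare only the interior points.
interior <- (h + 1):(n - h)
all.equal(gathered[interior], scattered[interior])
```

The trade-off with this kind of filtering is at the edges: the first and last 40 points come back as NA, so the smoothed line is shorter than the data by 80 points.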
