Maybe "wrong" is a reasonable replacement for "spam". Not using my spam tag seems a little worrisome. "Wrong" might call my attention to not using it when what I'm saying is clearly right.

I worked on gnuradio blocks for a bit! Yay! Radios are exciting.

I'm stuck around my cognitive inhibitions with a stats problem. I'm trying to estimate which among a set of population histograms a sample is most likely to fit, and I keep freakin' out trying to make my brain do it right. Really I'd like to give a number to each option, so the user has an idea of how certain to be, and about what.

I have two half-clang heuristics. One is to use a Bernoulli distribution for each bin to give the portion of samplings that would be a worse guess for the bin, and take the product of that metric across all the bins. I don't remember it well: it gave a nice similarity metric from 0%-100%. The other half-clang heuristic is a matrix solution. I make a matrix out of all the population histograms and solve for the vector that multiplies them to give the histogram in question. (I'll try to sketch the matrix one below.) It comes out with a nice result where 1.0 lands on choices that are precisely the same, but it makes incredibly poor guesses in situations where there is no really good choice. I made the heuristics while playing around with actual probability and statistics, but it isn't quite cognitively working for me. At this point I can barely think about it! I can barely review my existing heuristics, even.

Ideally, I'd like to output the actual probability that each histogram is the one, among the set, that the sampled histogram was drawn from. I've never taken many probability or stats classes, and the few I did take seemed to just be parroting things that were already obvious, so I didn't attend to them well, and I don't have much experience or training with these things. When I google things like this, it usually tells me to go through a song and dance involving confidence intervals and significance and such. Those are things I value, but since I guess I was a hacker, I really value understanding the things I use and picking the best solution based on that understanding.

One thing I've noticed that makes it a little easier for me is that stats descriptions can leave out which property they are describing the statistic of, which can make them more confusing. The probability of something happening is different from the probability of your guess about it being right, which is different from the probability of it happening if undescribed information about it is known, etc. etc. A stats page might mention the distinction once and then assume everyone remembers, and that's hard for me nowadays. Notably, the probability of something happening given data is different from the probability of it happening in the real world. That's confusing to me. It could be fun to model everything I measure to give it a good prior thingy, but that's not even what I want: I don't want to know what is most likely overall, I want to know what the data indicates is most likely. I'm not totally sure how to think about that, but the fundamental concept of summing favorable events and dividing by total outcomes clearly assumes a uniform distribution of outcomes, and my brain does that too when I consider things around me. If you want to make a fair comparison of things, assume they have a uniform distribution so the comparison acts fairly. I don't really know.
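For reference, here's roughly how I'd reconstruct the matrix heuristic today. I don't have the old code handy, so this is a guess at the approach, not the original: it assumes numpy, assumes the histograms are equal-length arrays of bin counts, and the function name is made up for this note.

import numpy as np

def matrix_heuristic(population_hists, sampled_hist):
    """Solve for the weights that mix the population histograms into the
    sampled one; a weight near 1.0 on a single candidate suggests the
    sample matches that candidate almost exactly."""
    # columns are the candidate histograms, normalized so total counts
    # don't matter
    A = np.stack([np.asarray(h, dtype=float) / np.sum(h)
                  for h in population_hists], axis=1)
    b = np.asarray(sampled_hist, dtype=float) / np.sum(sampled_hist)
    # least squares, since the system is usually not square and rarely
    # has an exact solution
    weights, *_ = np.linalg.lstsq(A, b, rcond=None)
    return weights

Written this way it's clearer why it behaves badly when nothing fits well: least squares happily hands back large or negative weights just to minimize the residual, and those say nothing about probability.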
So, if we want to figure out the likelihood of one histogram being generated from another among a set of distribution histograms, we could obviously enumerate all the possible histograms that could be sampled from each one in the set, count how many from each are the same as the one we actually sampled, and divide the count for the one in question by the total count of matches. I think I'm leaving something out there, statistically, when I consider it, but it seems like a helpful grounding point. The goal is clearly possible. With small histograms and relatively few samples, it would even be possible to simulate the above brute-force solution to give the probability that a histogram fits among a set, chart some data along every variable, and empirically derive an equation for the probability. But it seems like such a basic thing that, had I the education, I expect there would be some formula solution one would just know from the problem.
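Just to pin the brute-force idea down, here's a minimal sketch of the simulation, assuming numpy, assuming each candidate histogram is treated as the exact distribution it was measured from, and assuming the uniform "fair comparison" between candidates from above (the function name is made up for this note). The simulation also quietly handles the part I suspected I was leaving out: more-likely histograms simply show up more often among the random draws, so they get weighted correctly on their own.

import numpy as np

def match_probabilities(candidate_hists, sampled_hist, trials=200_000, seed=0):
    """For each candidate, draw `trials` simulated histograms with the same
    number of samples as `sampled_hist`, count exact matches, then divide
    each candidate's match count by the total matches across candidates."""
    rng = np.random.default_rng(seed)
    sampled_hist = np.asarray(sampled_hist)
    n = int(sampled_hist.sum())
    matches = []
    for hist in candidate_hists:
        p = np.asarray(hist, dtype=float)
        p = p / p.sum()                       # treat the counts as a distribution
        sims = rng.multinomial(n, p, size=trials)
        matches.append(np.sum(np.all(sims == sampled_hist, axis=1)))
    matches = np.asarray(matches, dtype=float)
    return matches / matches.sum() if matches.sum() > 0 else matches

And I think the formula solution I was expecting exists and is the multinomial one: the chance of drawing an exact histogram (k1, ..., kB) in n samples from bin probabilities (p1, ..., pB) is n! / (k1! * ... * kB!) * p1^k1 * ... * pB^kB, which scipy.stats.multinomial.pmf computes directly. Dividing each candidate's pmf value for the sampled histogram by the sum over all candidates should give the same numbers the simulation converges to, with no simulating.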