Hi! I need some help with math. Or statistics. Or something.
Conversation
Notices
-
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 06:47:06 JST Evan Prodromou
-
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 06:47:33 JST Evan Prodromou
So, I have a group of N objects. I want to pick out the M "best" objects, according to non-obvious criteria.
-
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 06:47:54 JST Evan Prodromou
I have a comparator function F that can take two objects, A and B, and return the "better" object. Unfortunately, the function is shitty, and only accurate X% of the time. (X is around 70%, say). It is deterministic, though -- it returns the same value for the same pair of args every time.
-
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 06:51:12 JST Evan Prodromou
Fortunately, my comparator function is pretty cheap to run; I can run it as many times as I want. Ultimately, you can think of the output as a big set of tuples like (first, second, better) -- like (A, B, A) ("between A and B, A is better"). There would be N * (N - 1) such tuples.
-
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 06:54:06 JST Evan Prodromou
I *think* can cheat here by using lots of comparisons in my selection mechanism. The chance that any comparison is wrong is 1-X%, but the chance that any two comparisons is wrong is (1-X)^2, and three comparisons being wrong is (1-X)^3, and so on. I'd like to figure out a way to use this to my advantage.
-
Embed this notice
bjb :devuannew: :emacs: (bjb@fosstodon.org)'s status on Wednesday, 26-Feb-2025 07:00:59 JST bjb :devuannew: :emacs:
@evan Is the comparison result supposed to be transitive? Ie (ignoring incorrect results) if (a better than b) and (b better than c), can we conclude that (a better than c)?
-
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 07:00:59 JST Evan Prodromou
@bjb yes, ignoring incorrect results, they *should* be transitive.
-
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 07:08:28 JST Evan Prodromou
@bjb so one mechanism is to use multiple comparisons. Let's say that objects "really" compare by alphabetical order, and the comparator's accuracy is about 2/3. Here are some comparisons between A and E:
(A, E, A)
(B, E, B)
(C, E, E) <- incorrect
(D, E, D)
(A, B, A)
(A, C, A)
(A, D, D) <- incorrectSo, the evidence we have for A > E is:
(A, E, A)
(A, B, A) and (B, E, B)These don't help:
(A, C, A) and (C, E, E)
(A, D, D) and (D, E, D) -
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 07:10:09 JST Evan Prodromou
@bjb There are multiple pathways through here, though. I guess one way to do this is to count all the paths that show one conclusion, and all the paths that show the other conclusion, and whichever one is bigger, that's your real comparison.
-
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 12:04:05 JST Evan Prodromou
So, I've been noodling with this for a while, and here's where I'm at. Code here: https://github.com/evanp/inaccurate_comparator
Basically, I realized that what I'm looking for is evidence in the output of a "real" difference. And that evidence can be direct or indirect: either I get "A > E" as output, or I get "A > B" and "B > E", which given the transitivity of the (real) operator, is also good.
-
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 12:10:19 JST Evan Prodromou
So, what I want to do is look at a bunch of comparisons, and see how many ways I can show that A > C, C > D, D > E, and so on. And, in the opposite direction, E > B, B > A, or whatever.
And I kind of have the intuition that the more "proofs" in one direction than the other, the better. And the longer the proof, the better -- errors should introduce noise, instead of linking up and reinforcing each other.
-
Embed this notice
Bruce Elrick (virtuous_sloth@cosocial.ca)'s status on Wednesday, 26-Feb-2025 12:11:15 JST Bruce Elrick
@evan My gut wants to construct a directed graph with nodes being your letters and edge weights/direction being your comparitor function.
Then apply some graph algorithm.
-
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 12:12:28 JST Evan Prodromou
So, I did a little generator that models comparing alphabetical letters and occasionally gets it wrong. It outputs a CSV file showing the supposedly "greater" letter first. Here's a run with the first 5 letters of the alphabet:
A,B
A,C
D,A
A,E
B,C
D,B
B,E
C,D
E,C
D,EThis has about a 30% error rate.
-
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 12:17:55 JST Evan Prodromou
Then, I wrote a little comparator function for any two letters, X and Y. It makes a directed graph, and then determines how many total paths there are from X to Y, and from Y to X. For the above data, it finds these paths:
[['A', 'B', 'C', 'D', 'E'], ['A', 'B', 'E'], ['A', 'C', 'D', 'B', 'E'], ['A', 'C', 'D', 'E'], ['A', 'E']]
[['E', 'C', 'D', 'A']]
Note that there are 4 "real" paths and 1 "bad" path from A to E, and only one "bad" path from E to A. I use the sum of path lengths to compare.
-
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 12:19:42 JST Evan Prodromou
I don't think this is really rigorous -- it's just based on some handwavy idea that it's hard for wrong comparisons to line up and give good proofs, and it gets very, very hard the longer the paths get and the more paths there are. I should probably spend some time proving it, though.
-
Embed this notice
Bruce Elrick (virtuous_sloth@cosocial.ca)'s status on Wednesday, 26-Feb-2025 12:21:42 JST Bruce Elrick
@evan This is why I want the graph algorithm.
Just like a tournament with 3 teams can have A beats B, B beats C, and C beats A, resulting in an inconclusive result, having the weights as well as a global graph algorithm is needed to get a more meaningful result.
Unfortunately I don't know what algorithm to apply (thinks back to undergraduate... source/sink, max-flow/min/cut, hands start waving).
-
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 12:22:01 JST Evan Prodromou
Finally, I wrote a little script to use the comparison and sort the letters. It starts with the letters considered, scrambles them, and then uses the comparator to get them back in the same place. It also "scores" an arrangement by the sum of the offsets of each letter from its proper place.
-
Embed this notice
Bruce Elrick (virtuous_sloth@cosocial.ca)'s status on Wednesday, 26-Feb-2025 12:22:22 JST Bruce Elrick
@evan LOL we both wave hands.
-
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 12:26:02 JST Evan Prodromou
So, for the above data, it gives this output:
original: ['A', 'B', 'C', 'D', 'E']
scrambled: ['E', 'C', 'B', 'A', 'D']
scrambled score: 10
sorted using compare(): ['A', 'D', 'C', 'B', 'E']
score of sorted: 4The output isn't blowing me away, but at least it's better than randomness. You can kind of see that the D and B are swapped, which seems... close to correct?
-
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 12:33:11 JST Evan Prodromou
So, here are the bad parts:
- That justification for comparing sum of total path lengths as the comparison is a little lame. I need to put that on more solid ground.
- The output is OK, but not really stunning. In particular, there are a lot of swapped letters right next to each other.The biggest bad part, though, is that finding *all* the paths between two nodes in a directed graph is an exponential problem. I can run with 5 letters just fine, but 26 letters doesn't work at all.
-
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 12:36:07 JST Evan Prodromou
I think I can fix that in a couple of ways. One is by doing binary search, which should cut down on some time.
A second point is that I don't *actually* need to know *all* the paths in each direction. I just need to know which direction has *more* paths.
-
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 12:37:45 JST Evan Prodromou
So, I might be able to get away with doing a breadth-first search through the graph, and do the 2-node paths first, then the 3-node ones, and so on. And once one direction has an overwhelming number of paths, I could just bail out and say, hey, there probably aren't enough very long paths here to make a difference, and it's not worth trying to find them.
-
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 12:41:38 JST Evan Prodromou
For the record, the actual problem domain for me is climate change actions for cities, like putting in bike lanes, electrifying the bus fleet, or having a heating oil to heat pump rebate. My team made a heuristic comparator for actions (reduced emissions/cost, applicability for city bioregion, side benefits) but it only gets about 70% right on comparisons compared to human experts. We also trained an ML model based on the expert feedback, and that does better but not great.
-
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 12:42:42 JST Evan Prodromou
We have thousands of possible actions, and hundreds (soon thousands) of cities, so it's not feasible to have a human consider every one. We just want to get a set of top candidates for a city team to choose from.
-
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 12:43:51 JST Evan Prodromou
@virtuous_sloth Yeah, that is what I ended up doing! Read the rest of the thread!
-
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 13:03:57 JST Evan Prodromou
Another thing about this process, for us, is that it's less about getting the actions in exactly the right order, and more about getting as many of the "real" top N actions, in any order, into the top N output. And for now this sorting process might be enough.
-
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 13:04:41 JST Evan Prodromou
@drawnto yeah, if there was some systemic error we could identify it and correct it. For now, let's say it's uncorrelated.
-
Embed this notice
drawnto (drawnto@woof.group)'s status on Wednesday, 26-Feb-2025 13:04:42 JST drawnto
@evan
Is the likelihood of the wrongness of comparison completely uncorrelated to the symbols compared?
If not this gets a lot harrier. -
Embed this notice
Evan Prodromou (evan@cosocial.ca)'s status on Wednesday, 26-Feb-2025 13:06:07 JST Evan Prodromou
@drawnto I don't think it's a knapsack problem, because there's no budget on the actions.
-
Embed this notice
drawnto (drawnto@woof.group)'s status on Wednesday, 26-Feb-2025 13:06:08 JST drawnto
@evan
This is actually a knapsack problem then.
I recommend you to look into that instead. You are doing a lot of thinking on how to approach writing a heuristic (and suboptimal) solver for a knapsack like problem.
Maybe considering it from the knapsack perspective helps. -
Embed this notice
Bruce Elrick (virtuous_sloth@cosocial.ca)'s status on Wednesday, 26-Feb-2025 13:08:40 JST Bruce Elrick
@evan Yeah, saw it going. Good luck!
-
Embed this notice