Scenius Demo, Chat. Jan 2, 2025

Kristen	Moving qualitative to quantitative as a general concept is fascinating to me - what’s concealed/what’s revealed in the process?
Jon	+1 also does an llm do this? Or direct embeddings? What works well/ what doesn't?
svitlana-ing	Yes and what information gets reduced away on the way?
charles adjovu	It should just be embeddings.
charles adjovu	But of course, LLMs rely on embeddings.
Daniel Friedman	The centroid cannot hold.
Kristen	What exactly is the “centroid”?
svitlana-ing	What would centroid mean in this context?
“A centroid is the geometric center or mean position of a group of data points in a multi-dimensional space.”	Wisdom of Crowds tech suggests that independent judgements followed by unbiased aggregation (an average) are accurate. In this context, the centroid is being used as the average. Whether it is the average is up for debate. But centroid code is readily available in Python and underpins data science.
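A minimal sketch of the centroid-as-average idea described above, assuming each response has already been turned into an embedding vector. The toy 3-D vectors, the function names, and the choice of cosine similarity for ranking are illustrative assumptions, not the simscore implementation.

```python
import math

def centroid(vectors):
    """Arithmetic mean of the vectors, taken dimension by dimension."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(embeddings):
    """Return (name, similarity-to-centroid) pairs, most central first."""
    c = centroid(list(embeddings.values()))
    scored = [(name, cosine_similarity(vec, c)) for name, vec in embeddings.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy embeddings (real ones have hundreds of dimensions).
embeddings = {
    "idea_a": [1.0, 0.0, 0.0],
    "idea_b": [0.9, 0.1, 0.0],
    "idea_c": [0.8, 0.2, 0.1],
    "idea_d": [0.0, 0.0, 1.0],  # deliberately far from the others
}

ranking = rank_by_similarity(embeddings)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Note that the centroid itself need not coincide with any single response; the ranking only says which responses sit closest to the average.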
svitlana-ing	Is it like a uniting denominator of all the embeddings/ideas - so perhaps philosophical smth (philosophy tends to be central node often, e.g. in wikipedia)
brianL	Any reason the centroid isn’t ~exactly in the center?
Daniel Friedman	Here it depends what the Axes are.
For example if the X and Y axes are the Principal Component (PC) loadings -- then the Centroid (near [0,0]) is the idea closest to the middle of the PC embeddings.
Daniel Friedman	PC is a simpler linear approach to the Embeddings which e.g. LLM use.
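A rough sketch of why the centroid lands at [0,0] when the axes are principal components, assuming NumPy is available. The random 8-D data stands in for real embeddings and the function name `pca_project` is hypothetical. Other 2-D layouts are not centered this way, so the centroid need not sit at the origin there.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project the rows of X onto their first principal components via SVD."""
    mean = X.mean(axis=0)
    Xc = X - mean  # PCA works on mean-centered data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]  # rows are principal directions
    return Xc @ components.T, mean, components

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))  # hypothetical: 50 ideas, 8-D embeddings

scores, mean, components = pca_project(X)

# The centroid of the original points, projected with the same transform.
centroid_scores = (X.mean(axis=0) - mean) @ components.T

# Index of the idea closest to the origin of the PC plot.
nearest = int(np.argmin(np.linalg.norm(scores, axis=1)))

print("centroid in PC space:", centroid_scores)
print("idea nearest the centroid:", nearest)
```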
Daniel Friedman	ELBO = Evidence Lower Bound = (negative) Variational Free Energy
Daniel Friedman	Finds the number of clusters K that minimizes free energy in terms of (Accuracy-Complexity), equivalent to BIC https://en.wikipedia.org/wiki/Bayesian_information_criterion
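For reference, the standard textbook definitions behind the two messages above can be written as follows. The symbols ($q$, $z$, $y$, $k$, $n$, $\hat{L}$) are the usual notation rather than anything from the chat, and the link between BIC and the log evidence is a large-sample approximation, not an exact identity.

```latex
\begin{aligned}
\ln p(y) &\ge \mathrm{ELBO}(q) = -F(q)
  = \underbrace{\mathbb{E}_{q(z)}\!\left[\ln p(y \mid z)\right]}_{\text{accuracy}}
  - \underbrace{D_{\mathrm{KL}}\!\left[q(z)\,\|\,p(z)\right]}_{\text{complexity}} \\
\mathrm{BIC} &= k \ln n - 2 \ln \hat{L},
\qquad \ln p(y) \approx -\tfrac{1}{2}\,\mathrm{BIC} \quad (\text{large } n)
\end{aligned}
```

Here $k$ is the number of free parameters, $n$ the number of data points, and $\hat{L}$ the maximized likelihood. Maximizing the ELBO (minimizing $F$) and minimizing BIC both trade fit against model size, which is how either can be used to pick the number of clusters K.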
Daniel Friedman	Something with low similarity can be novel, not necessarily spam.	The original purpose of the simscore project was to develop a wisdom of crowds aggregation method for subjective written opinions. So Novel ideas were not the focus. The use case for Wisdom of Crowds Tech is aggregation of independent opinions in order to determine an accurate collective opinion. Brainstorming would be a better solution to discover novel ideas.
Daniel Friedman	Prioritization, that seems related to the Temporal/Dependence/Urgency of the task. Not sure how that would simply relate to how Mainstream the idea is	Yes. A better word could be “ranking” vs the collective opinion. For prioritization the urgent/important matrix would be better. The use cases envisioned are for “forever questions” rather than use cases where urgency / importance are in question. A forever question could be “what are our long term goals?” While an urgent / important question would be “what do I need to do tomorrow or put off til next week?”
Daniel Friedman	There could be more points skewing to the left in that embedding space.
So if the Centroid is the geometric center (the mean of the points), it would be to the left of (0,0).
Kristen	Mathematical representation of similarity
Kristen	Clusters
svitlana-ing	Here is link to sim-score website for play https://sim-score.vercel.app/session/6758eceb6bd63ab08b904421
Gabriel	re: what is a centroid: It seems like it’s an abstract stand-in for what the "thing" is that people are “discussing around”. I’m thinking it’s the elephant in the parable “blind men and the elephant”
Ronen Tamari	Thinking about how many iterations are run - sometimes the solution emerges through process, sometimes a one off vote is enough.	Centola method, suggests running 3 sessions in sequence and judgements will converge. Participants learn from each other.
Gabriel	Yeah seems like it would be cool to do multiple rounds with the goal of getting everyone closer to the centroid.
Ankit	Thinking about the tension between groupthink and outlier ideas. Is the expected behavior to treat the centroid as canonical or the mathematical synthesis? Is it appropriately valuing “fringe” ideas?	I don’t know
Kristen	Is there a way to find where there are “frictions” or disagreements?	Yes, they can be in the same cluster
Gabriel	Wisdom of the crowds vs. folly of groupthink	This statement is the crux of the matter. If we allow open debate we get anchoring, herding and groupthink. Intuition says the debate is the hard work. It is hard work. But so many things go wrong. Who speaks first, who speaks loudest, social influence, lack of repeatability and power dynamics can lead groups astray…Madness. So for large diverse groups, decision making based on the simscore ranking system will be more accurate and eliminate noise (variability in judgement).
charles adjovu	https://dev.to/anurag629/centroid-based-clustering-a-powerful-machine-learning-technique-for-partitioning-datasets-41im
Daniel Friedman	The clustering is an "is".
What to do with distances, is an "ought".
For example someone could be interested only in the periphery.
Daniel Friedman	Yes for what Thomas said.
LLM embeddings are many-dimensional, because 2-D representations of natural language strings, give Absurdist outcomes.
Kristen	Wrt the reductive nature of a process like this - I’m wondering when it’s best to use this kind of approach?	IMO: 1. large diverse groups, 2. forever questions, 3. discussion threads, 4. time constraints, 5. where iteration is possible…ask the same question again at intervals (e.g. 6 months)
charles adjovu	Yes. For example, sentence-transformers gives embeddings of 384 dimensions.
Ankit	Currently using a 1536-dimensional vector on the project I’m working on. Compute time goes up but supposedly so does fidelity. Seems like there are diminishing returns at some point
Kristen	Yes - this is the “friction” I was getting at above - @svitlana-ing and I have chatted about this a bit, some friction can be very productive in certain scenarios	Within orgs, ppl interact with each other and the org’s systems. They develop opinions. Wisdom of Crowds Tech is ranking the group’s opinions about a question at a particular time. At that time the friction points aren’t known. No anchoring has occurred. Everyone sees the output together. It is important that the decision process is agreed at the beginning. But in all cases, getting the independent judgements and their aggregation in advance of discussions or actions will yield better results.
Gabriel	Seems like having a specific prompt with some constraints on format of the answer would lend itself well to this.
charles adjovu	Some use-cases here: https://dev.to/anurag629/centroid-based-clustering-a-powerful-machine-learning-technique-for-partitioning-datasets-41im Mostly when you need to group items.
Daniel Friedman	Interestingly, the "point of diminishing returns" in terms of Information Gain from adding dimensions, is the ELBO/BIC/VFE mentioned above.
charles adjovu	For example, if you needed to group content under different subjects or categories
Daniel Friedman	K Means Clustering
https://en.wikipedia.org/wiki/K-means_clustering
With K=1
Daniel Friedman	Or you can use variable K
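A small dependency-free sketch of the point above: K-means with K=1 returns exactly the centroid (mean) of all points, while a larger K splits them into groups. This is plain Lloyd's algorithm; seeding the centers with the first K points is an assumption made here for determinism, and the 2-D points are made up.

```python
def mean_point(points):
    """Arithmetic mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, iterations=20):
    """Lloyd's algorithm; the first k points seed the centers."""
    centers = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: squared_distance(p, centers[j]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [mean_point(c) if c else centers[j] for j, c in enumerate(clusters)]
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return centers

# Hypothetical 2-D points: a group of four and a distant pair.
points = [(0.0, 0.0), (0.0, 2.0), (2.0, 0.0), (2.0, 2.0), (10.0, 10.0), (12.0, 10.0)]

single = k_means(points, k=1)   # one center: the overall centroid
double = k_means(points, k=2)   # two centers: one per group
print(single)
print(double)
```

With variable K, the same routine could be run for several values of K and the results compared with a criterion such as BIC.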
Ankit	Curious how the newer models like o3 chain “private thoughts of reasoning” to derive any additional insights for something like simscore
Ronen Tamari	I’d be interested to hear more about differences with polis - when to use polis and when to use this	Set up meeting request for a Thursday session
Gabriel	I would love if the graph was more interactive so I could interact with clusters within the cluster etc.
Artem Zhiganov	there's also Ethelo

Ronen Tamari	This feels important
charles adjovu	Good question.
Gabriel	I wonder if it’s more for decision-making and prioritization rather than sensemaking per se	This is not sensemaking. It is decision making and ranking. The cognitive process runs through sensemaking and learning: decide, act, learn, sensemake, decide and act again. It is iterative.
Daniel Friedman	Ya awesome. I am also very interested in the information content of the Chain of Thought "private" strings, and their lengths, in terms of meta-cognition (How long & How to think?).
charles adjovu	Some interesting writing on this subject here: https://medium.com/@apiary/assuming-consensus-how-socio-technical-assumptions-are-influencing-decision-making-in-the-age-of-8d20fc73f0a8
Ankit	sometimes i struggle with knowing how long to think too…probably a very AGI-ish level question lol
Daniel Friedman	AGI was the externalized meta-cognitive technique we sought all along?
charles adjovu	@Shahar Oriel https://medium.com/@apiary/assuming-consensus-how-socio-technical-assumptions-are-influencing-decision-making-in-the-age-of-8d20fc73f0a8
Ronen Tamari	Thanks, looks like a cool ref!
charles adjovu	Very good and fun read!
Gabriel	Would be really cool to generate a “centroid summary suggestion” based on the most resonant points.
Artem Zhiganov	btw I interviewed Camille from Apiary for our blog https://blog.harmonica.chat/interview-with-camille
Daniel Friedman	"What should we do next", can also go stale very rapidly.
Whereas "What is the deepest possible direction to head towards" might have more semantic longevity.
Kristen	Can you re-create the centroid based on a prompt? Like I want to interact with this data set from unique perspective at a given time and find a new centroid based on my inquiry
thomas benham	Because it is all just basic math
thomas benham	So more text ie the prompt, is just another ref point
Kristen	At a different level though perhaps?
Daniel Friedman	"We will have learned more after acting", is the essence of epistemic foraging policy selection	cognitive process.
Ankit	An LLM prompt can produce a weighted algo for how to compute pairwise values. For example, if you are more interested in the friction points, then it can weigh those edges higher and the centroid as an answer could shift its position
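A minimal sketch of how weights can move the centroid, as suggested above. How the weights are derived (for example from an LLM prompt or from similarity to a query) is left open here; the weights and 2-D vectors below are invented for illustration.

```python
def weighted_centroid(vectors, weights):
    """Weighted mean of the vectors; equal weights give the ordinary centroid."""
    total = sum(weights)
    dims = len(vectors[0])
    return [sum(w * v[i] for v, w in zip(vectors, weights)) / total for i in range(dims)]

# Hypothetical 2-D embeddings: three nearby responses and one outlier.
vectors = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [6.0, 6.0]]

plain = weighted_centroid(vectors, [1, 1, 1, 1])
# Hypothetical weights for an inquiry that emphasises the outlying response.
shifted = weighted_centroid(vectors, [1, 1, 1, 5])

print("plain centroid:  ", plain)
print("shifted centroid:", shifted)
```

The same data therefore yields a different "answer" depending on the weighting, which is one way to read the idea of finding a new centroid from a particular perspective.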
Gabriel	I could see this giving useful and surprising results by having a question, sending out 5 LLM agents to answer the question, and then “compiling” them with simscore to get closer to an answer that “transcends and includes” all of the answers
Artem Zhiganov	Ronen asked: "I’d be interested to hear more about differences with polis - when to use polis and when to use this?"
Artem Zhiganov	Svitlana asked: "Since prioritization is done based on abstract math-y centroid that isn’t sensed-out much, is it the best way to go about sensemaking?"
Sensemaking is part of the cognitive process, a circle: sense, decide, act, learn, then back to decide. Without decisions or action we can’t learn.
svitlana-ing	What feels missing to me is coherence, the ideas are still sheet rows and you still need to go over them and sense-make / synthesize.	Well, once we decide on the top ideas, we need to act. To act requires another step in the cognitive process. This is then a project management issue…we know what the project is….now we need to decompose it into action items.
svitlana-ing	Thinking of complementary add-on for it: take all the messages, send them to LLM alongside Guiding Framework and ask to summarize/prioritize top three themes in paragraphs using Framework as “centroid”	https://chatgpt.com/share/6777295e-6388-8000-ad8c-7b3028aef0f4
Daniel Friedman	Action policy inference can be guided by Pragmatic value (sensory consequences of action being expected to yield preferred outcomes) and Epistemic value (expected Information Gain of action).
Ronen Tamari	Interested in that question too @Kristen
Kristen	Now we’re cooking!
charles adjovu	@Gabriel - @gabriel_export Technically, you do not need an LLM to do this type of work. So, you could do it for free or at-cost for a server.
Shahar Oriel	The Bias of the models involved can create biases towards certain voters.
charles adjovu	@Paul W Great question.
Daniel Friedman	Can use different methods of augmented sensemaking, from group inputs.
For example any embedding/clustering method could be contrasted with e.g. just pasting everything into Perplexity, etc.
charles adjovu	This seems like a good experiment for all of the sense-making tools in the Scenius.
Artem Zhiganov	let's A/B test different tools
Daniel Friedman	Imagine every neuron in visual cortex voting -- Where should we look next?
Then taking the centroid/mean/median of those votes.
That is an Epistemically-driven action selection.
Whereas if we were polling Pancreatic cells about insulin, it would be more Pragmatically driven.
Artem Zhiganov	I want something like this to replace YouTube comments
Ronen Tamari	Or something similar for thread summarization	Example of Arbitrum Forum….ranked paragraphs from forum posts. Two delegates thanked me for analysis and used it as basis for their decision.
Daniel Friedman	Could have LLM generated texts of different length, for any given Coordinate in the space that someone clicks on.
Ronen Tamari	Lol the centroid is like the black hole in the center of galaxies
Gabriel	We want to cross the event horizon

Kristen	Without getting spaghettified
Gabriel	Yes like hover on the centroid and see a suggested summary of what that centroid “is”.
Ronen Tamari	Down for A B Test
