The importance of 7
Transcendental - a number that is not algebraic, i.e. not the root of any non-zero polynomial with rational coefficients (and so cannot be built from the integers using +, -, *, / and taking roots, even fractional powers).
7 is an important number. It has an incredible history of cultural significance.
I wonder if, with working memory, less is more. Does having a limited capacity to store information require that abstractions be made such that the required complexity is kept within manageable bounds?
The Wikipedia page on the importance of 7 recently came up on Hacker News.
Over the last year I have developed what I think is an idea that could possibly lead to a general intelligence.
It is my belief that it does not require a large amount of computation to emulate the aspects of intelligence that we attribute to intelligent people. Clearly, deriving the function that maps from the space of a 4K video to categories which include tortoise or turtle is going to be computationally intensive: what you are trying to do there involves a large amount of information.
Equally, if you are trying to predict a non-linear pattern from data whose domain is large, you are going to need a large amount of computation, at least to discover the function, if not to compute its results after the fact.
I'm talking more about a very specific intelligence that can be thought of as a form of pattern recognition operating over broader domains but with smaller samples. It will do nothing to predict the next element in a time series, but it might be more akin to what is done in mathematics.
My understanding of why the logarithm is privileged as the functional inverse of exponentiation, even though its output on most numbers is transcendental, is that it can be used to solve so many problems in a way that the functional inverse of a randomly chosen quintic can't. (Degree-5 polynomials, in general, have roots that cannot be expressed in radicals, unlike quartic, cubic, quadratic and linear equations.)
I think there is an importance in the ability to reason about things that can’t be represented directly. Language has the power that I can have a conversation with you about “The sentence that can’t be said”. Is it paradoxical that I have referenced the sentence that can’t be said in a sentence? It isn’t! There is no self-reference in the sentence. I’m not saying that the sentence “The sentence that can’t be said” is an unsayable sentence simply that there exists a sentence that can’t be said and that that is what I am currently referring to.
It's hard to represent this kind of expressive power inside a computer program. Sure, computer programs themselves can represent it, but the semantics are hard to grasp even if you are using some form of meta-programming (writing programs that reason about programs).
The cleanest way in which I can see that these ideas can be represented is through something akin to a knowledge graph.
In this knowledge graph, the nodes represent concepts in the abstract. They exist only as places to point to. You might link their representation in a given natural language to the abstract concept, but the concepts themselves are simply something to point to: they get their semantics from how they are connected to other nodes. The labels of the connections are themselves nodes. This is something that is rarely permitted in graph databases; many can only represent it artificially, by adding the constraint that if an edge is labelled with an ID then there must exist a node which represents that ID.
What this allows for is not only reasoning about abstract concepts, but to be able to reason about the relationships between those concepts.
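To make this concrete, here is a minimal sketch in Python, purely illustrative and with node names of my own choosing, of a graph whose edge labels are themselves nodes, so that relations can be reasoned about like any other concept:

```python
# A minimal sketch of a knowledge graph in which edge labels are nodes too.
# Every identifier, including the ones used to label edges, lives in `nodes`,
# so the graph can hold statements *about* its own relationships.

class Graph:
    def __init__(self):
        self.nodes = set()    # abstract concepts: just identifiers to point at
        self.edges = set()    # (source, label, target) triples

    def add_node(self, node):
        self.nodes.add(node)
        return node

    def add_edge(self, source, label, target):
        # The label must itself be a node, so relations are first-class concepts.
        for n in (source, label, target):
            self.add_node(n)
        self.edges.add((source, label, target))

    def neighbours(self, source, label):
        # Everything reachable from `source` via the relation `label`.
        return {t for (s, l, t) in self.edges if s == source and l == label}


g = Graph()
g.add_edge("dog", "is_a", "mammal")
# Because "is_a" is a node, we can also make statements about the relation itself:
g.add_edge("is_a", "is_transitive", "true")
print(g.neighbours("dog", "is_a"))            # {'mammal'}
print(g.neighbours("is_a", "is_transitive"))  # {'true'}
```

Because the label "is_a" is an ordinary node, a fact such as its transitivity can be attached to the relation itself rather than repeated on every edge that uses it.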
Generation rules.
The other thing that I would like to include in this representation of knowledge is generation rules that are inherent to the graph itself. I believe this requires the existence of a set of base nodes, the machine code of the graph if you will. These might exist as formal propositions such as the existential quantifier (there exists) and the universal quantifier (for all).
These can then be used to define a graph that is finite in memory yet conceptually infinite, representing the ZF axioms and, by induction, all the natural numbers.
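One way to picture this, as a sketch of my own rather than a faithful encoding of ZF, is to store only a base node and a generation rule, and materialise new nodes on demand:

```python
# A finite-in-memory, conceptually infinite graph: we store only a base node
# ("zero") and a generation rule ("successor"), and materialise numbers lazily.
# This is an illustrative stand-in for base nodes + generation rules, not an
# encoding of the ZF axioms themselves.

class LazyNumberGraph:
    def __init__(self):
        self.nodes = {"zero"}      # the only node held explicitly at the start
        self.successor_of = {}     # edges materialised so far

    def successor(self, node):
        # Generation rule: every node implies the existence of its successor.
        if node not in self.successor_of:
            new_node = f"succ({node})"
            self.nodes.add(new_node)
            self.successor_of[node] = new_node
        return self.successor_of[node]


g = LazyNumberGraph()
three = g.successor(g.successor(g.successor("zero")))
print(three)         # succ(succ(succ(zero)))
print(len(g.nodes))  # 4 nodes in memory, yet the graph implicitly contains them all
```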
This way, the implied existence of an answer to a question can come before the answer is known. For example, without calculating, the reader can reason that 1334*3992 will have an answer and that the answer is a number. Furthermore, it might be noticed that the answer will be an even positive integer. All of this reasoning, I propose, is simpler than finding the answer: the complexity of establishing each of these facts is constant in the number of digits, whereas the multiplication (at least using the method people use) is n^2 in the number of digits.
Mul: N, N -> N
x, y where x or y has a final digit in (0, 2, 4, 6, 8) => x*y is even
These might be facts which in the knowledge graph are represented as attributes of the multiply function.
It might be argued that a computer can already find these by simply doing the multiplication, which it has dedicated hardware for. I would argue that knowing these facts about multiplication runs deeper than calculating. Using the properties of multiplication on positive integers, I can reason that the square of Graham's number exists and is a positive integer, even though it is not feasible to compute Graham's number.
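As a toy illustration (my own, with hypothetical attribute names), facts like these could be attached to the multiply node and queried without ever performing the multiplication:

```python
# Toy illustration: properties attached to the "multiply" concept let us reason
# about a product without computing it. The attribute names are hypothetical,
# chosen just for this sketch.

def product_properties(x_props, y_props):
    """Derive properties of x*y from properties of x and y alone."""
    props = set()
    if "positive_integer" in x_props and "positive_integer" in y_props:
        props.add("positive_integer")
    if "even" in x_props or "even" in y_props:
        props.add("even")
    return props

# 1334 and 3992 are both even positive integers, so the product is an even
# positive integer -- no arithmetic required.
print(product_properties({"positive_integer", "even"}, {"positive_integer", "even"}))

# The same reasoning applies to the square of Graham's number, which we could
# never write out digit by digit:
graham = {"positive_integer"}
print(product_properties(graham, graham))   # {'positive_integer'}
```

Each of these checks costs a constant amount of work given the stored properties, against the n^2 cost of the multiplication itself.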
So we have this graph; how do we direct the reasoning on it to form useful thought and fulfil the promise of enabling efficient computation of abstract thought?
This is where I think the number 7 comes into play.
Take the following structure of 6 nodes:
This isn’t anything revolutionary but the rules around how these 6 nodes interact with the graph as proposed might introduce something novel.
What I’m representing here is the comparison between two relations.
Nodes 5 and 6 represent the relation (or at least a relation) between (1,4) and (2,3) respectively.
I'm not set on what the 7th node should be. There are multiple options, but my current favourite is that the 7th node represents the entirety of the diagram.
Using all 7 nodes is not always going to be necessary, and I will go on to propose that the function which is used to progress this system through time favours using fewer of the nodes.
Starting with a simple example, say we have a number represented in the first node and the operation increment represented in node 5.
This implies the existence of a node in position 4. From the first node we can search the knowledge graph with something akin to the Cypher query (1)-[:5]->(n).
If we have a node representing the increment of the number, then we simply load that value in to complete the pattern.
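A hedged sketch of that completion step follows; the knowledge graph is simplified to a dictionary and all names are illustrative only:

```python
# Sketch of the completion step: node 1 holds a value, node 5 holds a relation,
# and node 4 is filled by querying the knowledge graph for an edge matching
# (value) -[relation]-> (?), much like the Cypher pattern (1)-[:5]->(n).

knowledge = {
    ("2", "increment"): "3",
    ("3", "increment"): "4",
}

def complete(structure):
    """Fill slot 4 of the seven-node structure if the graph knows the answer."""
    key = (structure[1], structure[5])
    if key in knowledge:
        structure[4] = knowledge[key]   # load the known value in
    return structure

structure = {1: "2", 5: "increment", 4: None}
print(complete(structure))   # {1: '2', 5: 'increment', 4: '3'}
```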
What if the operation doesn’t have a defined value?
If, instead of increment, we have the operation sqrt in node 5 and a negative number in place of node 1, then we have a way of defining the square root of a negative number, which can then be used in other ways. We can now talk about things we don't have an existing representation for.
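Continuing the sketch above (still hypothetical), when the lookup fails we can mint a fresh node that merely denotes the result, giving us something to refer to even without a defined value:

```python
# When the graph has no defined value for the pattern, mint a new node that
# simply denotes "the result of applying this operation to this value".
# That node can then take part in further reasoning.

knowledge = {}   # nothing known about the square root of a negative number

def complete_or_define(structure):
    key = (structure[1], structure[5])
    if key in knowledge:
        structure[4] = knowledge[key]
    else:
        # No defined value: create a referent for it, as with sqrt(-1).
        new_node = f"{structure[5]}({structure[1]})"
        knowledge[key] = new_node
        structure[4] = new_node
    return structure

print(complete_or_define({1: "-1", 5: "sqrt", 4: None}))
# {1: '-1', 5: 'sqrt', 4: 'sqrt(-1)'} -- a concept we can now talk about
```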
How do we know if applying seemingly non-sensical operations is worth doing?
I think this question comes back to the earlier argument about why the logarithm is a more useful functional inverse of exponentiation than the inverse of any given quintic: the same reasoning applies here. If it is useful for us to define what the smell of pi is, or the square of a pie, then we can progress with our reasoning.
There is a worrying trend in Silicon Valley of taking micro-doses of psychedelics. It's a trope that when people are high they are able to, or at least are more likely to, ask seemingly nonsensical questions, such as what if a country had an imaginary GDP. Sometimes these more distant connections, which could be represented as matching functions to arguments outside their domain, are exactly the sort of thing creativity is there to do.
So I have given an example where we have used the left-hand side, but what about the others?
Well, what the others allow for is things like the continuation of a sequence, or inverse operations.
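For instance (again a toy sketch reusing the dictionary-style graph from earlier), the same stored edges can be matched in the other direction, which gives an inverse operation for free:

```python
# The same stored edges, matched in the other direction: given the result
# (slot 4) and the relation (slot 5), recover slot 1. Increment run backwards
# is decrement, without ever storing a separate "decrement" relation.

knowledge = {
    ("2", "increment"): "3",
    ("3", "increment"): "4",
}

def complete_inverse(structure):
    """Fill slot 1 by finding an edge whose relation and target match slots 5 and 4."""
    for (source, relation), target in knowledge.items():
        if relation == structure[5] and target == structure[4]:
            structure[1] = source
            break
    return structure

print(complete_inverse({1: None, 5: "increment", 4: "3"}))
# {1: '2', 5: 'increment', 4: '3'}
```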
Although I haven't implemented the system, I envision its implementation using a stochastic, Monte-Carlo-style propagation over the possible traversals through the graph, for at each stage the next fill of the graph will have many options. I think this links somewhat to the post I wrote about the importance of goals for intelligence. I think that without a goal, without some preference over the future states of the world, it is not possible for the agent to develop meaningful abstractions, for it is my belief that abstractions can only be meaningful with respect to a goal.
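Since I have not implemented this, the following is only a rough sketch of the propagation I have in mind: sample candidate fills of the structure, score them against a goal, and bias towards candidates that use fewer of the seven slots. The candidate generator, goal function and penalty are all placeholders of my own.

```python
import random

# Rough sketch of Monte-Carlo-style propagation: at each step there are many
# ways to fill the structure, so we sample candidates and keep the one that
# scores best against a goal, with a penalty for using more of the seven slots.

def score(candidate, goal_value, slot_penalty=0.1):
    slots_used = sum(1 for v in candidate.values() if v is not None)
    return goal_value(candidate) - slot_penalty * slots_used

def propagate(candidates, goal_value, samples=10):
    sampled = random.sample(candidates, min(samples, len(candidates)))
    return max(sampled, key=lambda c: score(c, goal_value))

# Placeholder goal: prefer structures that managed to fill slot 4 at all.
goal = lambda c: 1.0 if c.get(4) is not None else 0.0

candidates = [
    {1: "2", 5: "increment", 4: "3", 2: None, 3: None, 6: None, 7: None},
    {1: "2", 5: "increment", 4: None, 2: "9", 3: "8", 6: "x", 7: "y"},
]
print(propagate(candidates, goal))   # the sparser, goal-satisfying fill wins
```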
This is where I have struggled to implement the system: positioning the agent in a world where it has a goal. I have toyed with the idea of placing it in some classic game and having multiple versions of the agent able to communicate and make decisions about the use of resources.
Manipulation of the world outside of the mind can be done in a similar way to how some of the operations on the underlying graph are "baked in". Outputs could also be baked in, allowing the agent to interface with the outside world.