Note: I use AGI and Strong AI interchangeably, both stand for human-level artificial intelligence
It seems almost safe to say that the A.I. hype around deep learning is a sure sign that we’re out of the A.I. winter. Almost. I think the current hype is based on effective, but ultimately narrow A.I. paradigms. Narrow or weak A.I. is brittle. This means it breaks down outside of a particular scope or domain for which it is specifically designed. It remains to be seen whether we will end up in another period of disillusionment or bridge the painfully significant conceptual rift between narrow A.I. and strong AI during this hype cycle. I think it won’t happen without very well funded and dedicated research efforts. Unfortunately, despite the recent rise of interest in A.I., the field of artificial general intelligence remains woefully underfunded.
Analysis of the reasons for the lack of funding of AGI is outside of the scope of this blog post. One reason is that building AGI has always been considered a (too) daunting challenge. After initial excitement and positive prognoses mid-20th century, it became clear building general intelligence is not quite the walk in the park some declared it to be. Nowadays, some still claim it’s impossible. I disagree. However, let there be no doubt that the pursuit of building AGI is indeed fiendishly difficult. In fact, I think it is among the top three intellectually demanding pursuits in science and engineering. The road to AGI is paved with great difficulties. In this post I will discuss one of these difficulties: the danger of avoiding models in your AGI design. They are handy for narrow AI, but deadly for AGI. I think this is an important piece of the puzzle. Nevertheless, it is not without its own set of pitfalls.
Models are not evil
Let’s get something out of the way: models are not evil. However your AGI architecture is constructed, you have a model for it. Whatever your idea of general intelligence and learning is, you have a model for it. Yes, you should avoid feeding your AGI system models in its learning domain. This is because it should be able to acquire its own models through perception, just like we do. If it can’t do that, it’s not general intelligence. Yes, you should understand the difference between supervised learning and unsupervised learning or top-down and bottom-up designs. (Note: these concepts are more loaded in context of AGI than narrow AI). However, there are many levels of description and organization to define and models are not unequivocally bad for all of them. If I give you (for the sake of the argument) a correct model, such as a blueprint of a generally intelligent artificial being and the means to construct it, then each and every accurate diagram you might draw of this being, such as the the photo at the top of this post, is absolutely correct too. There won’t be any danger in over-specification here, because we’ve established it’s correct. As such, it is not to be discarded simply because it’s a model. This may seem crude and banal, but the question here is: “Is it sensible to discard models from virtually all of the levels of organization in your system, besides the learning domain? Some think so. This is because the mind is considered too difficult to model and we know too little. Unfortunately, this is not as sensible as it may first appear. Worse, as I will argue later on, it might tempt you to oversimplify your system to the point of rendering developing general intelligence intractable: your system might never be able to bootstrap itself into a robust general intelligence because of it.
The difficulty of modeling a mind
So, when unpacked, “models are bad” is more lucidly expressed as “There is no way you can model [what is deemed necessary for AGI], so don’t even try”. I think this argument, once steel-manned (strongest version of the argument), has some merit given the current levels of ignorance in artificial intelligence and relevant domains. It’s quite hard to fully disagree with this. It’s not difficult to demonstrate just how little we know, even today. On the other hand, I would strongly warn against discarding lessons from the wondrous and advanced machinery that is biology. Among other reasons, I say this because you are still using models and in more ways than you may think, whether you admit it or not. You are modeling the human mind. If you are not doing it explicitly, you’re doing it implicitly. It’s our best example of high general intelligence. Even if you think most aspects of human minds are irrelevant and poorly understood, you’re still modeling intelligence or learning, no matter how you spin it. And that is where the problems start.
The separation of “model of the world” and “model of the machine that is supposed to model the world” is in itself an abstract distinction between two abstracted domains. This distinction is not only fluid (depends on your design, taxonomy, etc.), but it’s also fuzzier than you might think. The difference between force-feeding an ontology to your AI and constraining its learning so that you have implicitly fed models to it, is a very hard one to make. The architecture of your machinery, your hyperparameters, the type of data you feed it are all potentially contaminated by models (of the world), directly and indirectly. Adhering more to the East rather than West pole in the East-pole West-pole divide can make all the difference. Implementing some idea of Chomsky’s universal grammar to achieve natural language processing can make or break your system. Yet the questions in these domains are largely unanswered. This is a lethal gray area.
The hidden pitfall of model avoidance
To you avoid all uncertainty you go with what you think you know: a tabula rasa (blank slate). But the blank slate is still a slate. And its properties are in your hands. I have previously expressed this as “Even a tabula rasa needs a tabula“. Your tabula is still a model. Deflating it to maximize its evolvability is a good instinct, but it is not necessarily a sensible decision. You abstract away lots and lots of features of the human mind you deem inessential, to model that which you consider the crucial mechanism to acquire models. How sure are you about abstracting away all those features? Is the lack of understanding them a good reason to discard them? Are you sure you’re not throwing out the baby with the bathwater? The irony here is that in zeal to avoid crippling your system with models, you just might end up crippling its ability to evolve general intelligence.
So your AGI system affected by what you choose to implement as much as by what you choose not to implement. And you may be implementing too little. While you have many advantages over evolution, such as running countless iterations easily, it’s not as easy to put in the right parameters. Your means of input and environmental constraints are limited, compared to the evolutionary pressures that formed biological systems. Almost 3.6 billion years of evolution have shaped biological machinery. The design of our brain and body has been evolved by ever-changing landscapes of evolutionary pressures. The richness of these evolving environments is far beyond anything you are feeding into your system. It’s safe to say evolution is the mother of all optimization processes. We can’t forget, however, that after 3.6 billion years, it has produced many examples of sub-human intelligence (all non-human animal species, extant and extinct). Even in light of remarkable examples of animal cognition, such as the behaviors of the smartest of animals like Cetaceans or Corvids, it’s hard to dispute that humans are exceptionally intelligent. Yet homo sapiens is only one species among millions. Thus, millions of species have not evolved the human-level general intelligence, but went through the same unfathomably complex evolutionary process for billions of years. That should worry you. It should also help you realize, that the right (as elusive as ‘right’ can be) model can be extremely useful, when facing limited ways to evolve your system. Evolution is a very, very tough act to follow. This means you have a fiendishly tricky trade-off to make between the complexity of your AGI system and the complexity of its environment and data. In my opinion, this trade-off is currently very poorly understood and avoiding models is no panacea. It’s only one piece of a complicated puzzle.
As in any endeavor with very difficult open questions aplenty, expert opinions on what is necessary and sufficient for AGI are very divided. You need to acquire a better understanding of what you’re trying to build. By this I don’t mean working on meta-heuristic procedures to gain more insight into your neural net. I mean interdisciplinary AGI research. You can get away with computer science knowledge when building narrow AI, but building AGI is the most interdisciplinary pursuit there is. Computer science and bits of neuroscience will not suffice. I will cover that in a future post. For now, I leave you with two questions every AGI developer should never stop asking: “What have I abstracted away, and why?”