History of Perceptrons

Perceptrons with a single layer of weights can only cope with linear problems. More complex problems, known as nonlinear problems, can at best be approximated in a rough fashion, and are sometimes too difficult for perceptrons to deal with altogether; even the linear approximation is of little use in some cases. The original authors partly acknowledged these weaknesses, but it was Minsky and Papert who shed full light on the limitations of the technique in their 1969 book Perceptrons [Minsky69]. They showed that perceptrons have trouble dealing with higher-order problems; specifically, they cannot cope with problems of order above 1 (that is, anything beyond linear problems). Perceptrons showed that to handle a problem of higher order, at least one partial predicate must be supported by the whole input space. Intuitively, this means that at least one intermediate unit must be connected to all the inputs. Perceptrons are restricted to predicates of limited order, so they cannot satisfy this requirement, which explains their inability to cope with nonlinear patterns. Minsky and Papert subsequently argued that even multiple layers of weights connected together (or cascaded linear networks, for that matter) could not cope with high-order patterns, although they could not prove it. Indeed, as long as the activation functions are linear, any cascade of layers collapses to a single linear mapping, so nonlinear problems remain out of reach. Rosenblatt had touched upon this in his initial perceptron by using Boolean activation functions (which are nonlinear), but he could not find a way to train the whole system. This dried up funding in the area, and many researchers moved to symbolic AI as an alternative.
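The two technical claims above can be demonstrated with a short sketch; the example below is an illustration added for clarity, not part of the original discussion. It shows that the perceptron learning rule never converges on XOR, the simplest order-2 problem, and that stacking weight layers without a nonlinear activation collapses to a single linear layer.

```python
# Minimal sketch (illustrative, not from the original text) of two claims:
# 1) a single-layer perceptron cannot learn XOR (an order-2, nonlinear problem);
# 2) cascading layers with linear activations collapses to one linear mapping.
import numpy as np

# --- 1) Perceptron learning rule on XOR --------------------------------------
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])   # inputs
y = np.array([0, 1, 1, 0])                        # XOR targets

w = np.zeros(2)   # weights
b = 0.0           # bias

for epoch in range(1000):
    errors = 0
    for xi, target in zip(X, y):
        pred = 1 if xi @ w + b > 0 else 0        # threshold (Boolean) activation
        if pred != target:
            w += (target - pred) * xi            # perceptron update rule
            b += (target - pred)
            errors += 1
    if errors == 0:                              # never happens for XOR
        break

preds = [(1 if xi @ w + b > 0 else 0) for xi in X]
print("XOR predictions after training:", preds, "targets:", list(y))
# The four predictions never all match the targets: XOR is not linearly separable.

# --- 2) Cascaded linear layers are still linear -------------------------------
rng = np.random.default_rng(0)
W1 = rng.normal(size=(2, 3))   # first layer weights
W2 = rng.normal(size=(3, 1))   # second layer weights
x = rng.normal(size=2)

two_layer = (x @ W1) @ W2      # two layers, no nonlinearity in between
one_layer = x @ (W1 @ W2)      # equivalent single linear layer
print("Same result from one collapsed layer:", np.allclose(two_layer, one_layer))
```

The second half of the sketch is why adding layers alone does not help: without a nonlinear activation between them, the product of the weight matrices is itself a single weight matrix, so the cascade remains a linear (order-1) model.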