Recursive Predictive Neural Networks

The output of a neural net can be represented as y, derived from an input x and a feedback operator f(y). The network can be trained on both, and the operator f may include differential and integral operators. As f(y) can be considered the feedback synchronization point, which is clocked to transit the network forward in prediction, f(y) is delayed in y, so that f(y(t-1, …, t-n)) is the applied feedback. The delay stops “epileptic oscillation” of the forward net function.
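As a minimal sketch of the clocked, delayed feedback described above, the following uses a single squashing unit. The function name `run_feedback_net`, the choice of a mean over the delayed outputs as f, and the gain value are all assumptions made for illustration, not part of the original design.

```python
import math
from collections import deque

def run_feedback_net(xs, n_delay=3, feedback_gain=0.5):
    """Clock a one-unit net forward; feedback comes only from delayed outputs."""
    # Shift register holding y(t-1), ..., y(t-n); it starts with no signal.
    history = deque([0.0] * n_delay, maxlen=n_delay)
    outputs = []
    for x in xs:
        # f(y): here simply the mean of the delayed outputs (an assumption).
        f_y = sum(history) / n_delay
        # Forward function: a squashing unit mixing input and delayed feedback.
        y = math.tanh(x + feedback_gain * f_y)
        outputs.append(y)
        history.appendleft(y)  # newest sample first, the oldest drops off
    return outputs

ys = run_feedback_net([1.0] * 50)
```

With a constant input the output settles onto a fixed value rather than oscillating, because the current y never feeds itself within the same clock step.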

The network itself can be programmed on the sequence to learn by open-loop gradient descent, with the bias of x activation against f(y) remembrance set either by weighting or by digital percentage gating. The pattern to lock onto can be trained independently of an input, and then offset by applying the triggering input to balance activation of one output versus another. The actual spreading and maximization of the output attractors thereby becomes disjunct from instancing which attractor to present as output for a given input.

The “old” feedback from the “last” remembered thing introduces some chaos and mal-attractor effect. This can be reduced a little by using an expected previous context training pre-sequence. This can then also introduce contextual recall. The “short term memory” is the contextual state of y, so the long term sequence prediction memory is programmed with context y and stimulus x.

The production of optimal context for a stimulus itself becomes a network programming challenge. It represents the concept of changing predictive utility. As the forward transfer of the network produces the output to feed back, the network itself could produce the optimal context from the requirements delivered through part of x deciding the contextual decode mode. A separate net to organize the change of context in bulk would have the advantages of specialization, separation and generation of terms in parallel. In utility, though, it would only be used to switch contexts, or to cross-imagine contexts to place the prediction net on a creative sequence.

This could have application when the context is considered a genetic algorithm process for tuning the network to produce some kind of granular attractor synthesis. The process of providing the scoring feedback in synthesis mode would be controlled by a hardwired concept of misadventure excursion in the prediction. Perhaps another network for bad state recognition could complement the entropy generative context granularization network? So the reality predictive network is contextualized, granulated and tested for productive futures. Then a final factorization of the synthetic addition requirements of the imagined product can be performed by a final independent network.

Consciousness is within this last network, as the self image of adding self as a possibility factor. The production of a threshold of motor action, to produce an attempt at achieving the estimated reality granularization (subject to bounds constraints), is the primary motivator.

A Speech Action Co-ordination Domain

If the input x and the output y, along with feedback descriptions, current “genetic” gene combinators and more, can be serialized as an inter-AI language, then the projections of multiple “conscious” entities in the predictive net of reality simulation can engage in factoring for product optimization as well as other non zero sum optimizations. This needs a net to process one internal representation into another, with an acknowledgement of simultaneous state and with confusion feedback. At higher data rates a negative acknowledge protocol can take over, with estimations of animism action between confirmations, and certainty subject to residual accidental error bounding.

A Survival Function

The selection basis of the context provided to the reality estimation can adapt to return a higher valuation of the survival “situation understanding” function. This in the real sense is the optimization function for selection of purpose. The reality function just attempts to maximize a correct simulation of reality. The context function attempts to maximize use of granular entropy to increase the coverage range of the reality simulation to increase options of consciousness to action. The action threshold function then decides if the likely action chosen is done, and in a way represents a kind of extrovert measure of the AI.
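The action threshold function can be sketched as a classifier with a cut-off. The name `select_action`, the action labels and the valuation numbers below are hypothetical, chosen only to show how lowering the threshold makes the same valuations lead to action more often.

```python
def select_action(valuations, threshold):
    """Return the best-valued action if it clears the threshold, else None."""
    best = max(valuations, key=valuations.get)
    return best if valuations[best] >= threshold else None

# Hypothetical valuations produced by the earlier stages.
candidates = {"approach": 0.62, "wait": 0.30, "retreat": 0.08}

bold = select_action(candidates, threshold=0.5)      # acts on the likely action
cautious = select_action(candidates, threshold=0.8)  # declines to act
```

In this reading, the threshold value itself is the "extrovert measure": the lower it is, the more often the likely action is actually done.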

Component Parts

  • Reality simulation (estimation)

  • Reality factorization (situation)

  • Granular imagination (context)

  • Action selection (desire)

  • Input processing (percept)

Using some kind of Fibonacci growth connection in a surface topological toroid? That would be more on hardware interconnect optimization. Of more interest to the feedback in the reality simulator would be the parametrized operators building differential and integral representations from the feedback. Of the three forms of end point integral, all could be represented. The fact that the log kind has complex series to evaluate, and has no necessary complex log representation might be an added difficulty but would “lock” onto such functional time generatives.

Negative time offsets on the end point limit of such integrals, when complex processing is applied, introduce the idea of the 2*pi synchronous summand based on angle, as this may be a better input controlled output representation of the complex domain for an N:1 mapping. A Gaussian distribution of error about the coefficient division.

Chaos Measure

The feedback operator f depends on calculation of differential and integral functions based on weighted sums of y at various t, and so it could be said that any initializing or changing of the reality simulation to another playback “granule” has some new data placed in the feedback memory. This new data can have a varied impact, based on the likelihood estimation of the time samples having an impact on the calculated differential and integral values, along with sensitivity to the feedback signal. This implies each memory bit has some measure of the effect that a bit change (as in a genetic algorithm mutation) has on the divergence from the reality simulation. This can then be used to infer a focus mask. The use of gene crossing focus weighting or masking then synchronously produces a chaotic deviation from the training reality.
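One way to picture the per-slot chaos measure is to nudge each feedback memory slot in turn and score how far the playback diverges. This is only a sketch: the toy `simulate` function, its weights, and the half-of-maximum rule for the focus mask are assumptions, not the trained reality simulator.

```python
import math

def simulate(memory, steps=5):
    """Toy reality simulator: each step predicts from a weighted sum of memory."""
    weights = [0.6, 0.3, 0.1]            # hypothetical feedback weights, newest first
    mem = list(memory)
    out = []
    for _ in range(steps):
        y = math.tanh(sum(w * m for w, m in zip(weights, mem)))
        out.append(y)
        mem = [y] + mem[:-1]             # shift the new sample into memory
    return out

def chaos_measure(memory, delta=1e-3):
    """Divergence per unit nudge for each memory slot, one slot at a time."""
    base = simulate(memory)
    scores = []
    for i in range(len(memory)):
        nudged = list(memory)
        nudged[i] += delta
        run = simulate(nudged)
        scores.append(sum(abs(a - b) for a, b in zip(run, base)) / delta)
    return scores

scores = chaos_measure([0.2, -0.1, 0.4])
# Assumed rule: focus on slots with at least half the largest sensitivity.
focus_mask = [s >= max(scores) / 2 for s in scores]
```

Here the newest slot dominates, so a mutation applied there deviates the playback most, while the oldest slot falls outside the mask.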

Modulation of the stored memory context would appear on some level equivalent to altering the coefficients of the estimates for differentials and integrals, but as the chaos measure is a deviation control from an exacting physical model of time evolution, it is thought better to keep the operator mathematics at a static precision, and deviate granularity by memory modulation.

For example 1, -9, 36, -84, 126, -126, 84, -36, 9 are the coefficients (oldest sample first) to predict the next future sample from the previous nine samples, based on setting the ninth difference estimate to zero. In open-loop training the feedback would introduce a delay step, but prediction of the future would in effect cancel this delay, so that effectively f(y) does not have to be calculated and y can be used. The large range of the coefficients would create some oscillation as the context shift registers were filled with data to feed back. This open-loop programming without reference to f allows pre-training without any feedback instability, but with a later oscillation about the manifold.
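The nine-sample predictor can be checked directly. Setting the ninth finite difference to zero gives signed binomial weights with strictly alternating signs (so the eighth term is -36), and the result extrapolates any polynomial of degree eight or less exactly. The function names below are illustrative only.

```python
from math import comb

def predictor_coefficients(n):
    """Signed binomial weights, oldest sample first, from a zero n-th difference."""
    # y[t] = sum over k = 1..n of (-1)^(k+1) * C(n, k) * y[t-k]
    newest_first = [(-1) ** (k + 1) * comb(n, k) for k in range(1, n + 1)]
    return newest_first[::-1]

def predict_next(samples, coeffs):
    """Weighted sum of the most recent len(coeffs) samples (oldest first)."""
    window = samples[-len(coeffs):]
    return sum(c * s for c, s in zip(coeffs, window))

coeffs = predictor_coefficients(9)
history = [t ** 3 - 2 * t for t in range(9)]   # a cubic sampled at t = 0..8
predicted = predict_next(history, coeffs)      # equals 9**3 - 2*9 = 711
noise_gain = sum(abs(c) for c in coeffs)       # 511
```

The coefficients sum to 1, but their absolute values sum to 511, so a small error in one stored sample can be amplified considerably. That is consistent with the oscillation expected while the shift registers fill.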

Computational stability requirements are improved if the feedback f is amplified by default expectation, as this forces some non-linear mixing of x to reduce the net summand, moving the bode point of the feedback away from the inactive denormalized zero value. It also increases the net feedback applied to keep the reality simulator feed forward gain below one.

All n orders of differential can be cast as future predictions, and all the integral accelerated forms can be represented with future casting into any t with some renormalization possible but not essentially a necessity. In fact a rectangular offset in the y-axis integrates as a ramp addition to a monotonically increasing sum. Can the network learn a root finding algorithm for applied integral time when wired with learnable pass through of a variable integration time? This time offset from the future prediction time (integral offset time) u can be fed into the operator f and passed through as f(y(t(n)), u(t(n))) with some of the prediction y being used as u.
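A small sketch of the operator side, assuming a unit time step and the simplest possible estimators: a backward difference for the differential, and a rectangle-rule sum over the last u samples for the integral. Both function names and the rectangle rule are assumptions. It shows the constant offset integrating to a ramp, with u passed in as a variable integration time.

```python
def backward_difference(ys, order=1):
    """n-th backward difference at the newest sample (unit time step)."""
    diffs = list(ys)
    for _ in range(order):
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
    return diffs[-1]

def windowed_integral(ys, u):
    """Rectangle-rule integral over the last u samples (unit time step)."""
    if u <= 0:
        return 0.0
    return sum(ys[-u:])

offset = [2.0] * 10                        # constant rectangular offset in y
# Integrating from the start up to each t gives a monotonically rising ramp.
ramp = [windowed_integral(offset[: t + 1], t + 1) for t in range(10)]
slope = backward_difference(ramp)          # recovers the offset height
curvature = backward_difference(ramp, 2)   # zero for a pure ramp
```

Since u is an ordinary argument here, part of the prediction y could be routed into it, which is the pass-through of integration time suggested above.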

Alias Locking

In any synchronous DSP circuit with non-linear effects, x and f(y) must be kept within the frequency range where alias distortion would not present as false signal, which indicates that the coefficients could be modified to provide an alias filter. But it may be found that a small chaotic dither spreads the aliases further and leads to a wider band spreading about an alias. The detection of a coincidental alias may aid detection of the signal expected. This extra minimal noise could be extracted from the environment by deviations from expectation. An AI task of removing aliases may be considered as something that could be learnt, as could generating an inverse filter to supply the alias spectrum (excluding sub-harmonics of the clock rate).
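To make the alias concrete: a tone above half the clock rate produces exactly the same samples as a lower tone folded back into band. The sketch below uses assumed example frequencies (a 100 Hz clock and a 130 Hz tone) and hypothetical helper names.

```python
import math

def alias_frequency(f_signal, f_clock):
    """Apparent frequency of a tone after sampling at f_clock."""
    folded = f_signal % f_clock
    return min(folded, f_clock - folded)

def sample_tone(freq, f_clock, n):
    """n cosine samples of a tone at freq, taken at the clock rate."""
    return [math.cos(2 * math.pi * freq * k / f_clock) for k in range(n)]

f_clock = 100.0
apparent = alias_frequency(130.0, f_clock)   # folds down to 30.0
above = sample_tone(130.0, f_clock, 16)
alias = sample_tone(apparent, f_clock, 16)
# The two sample streams agree to rounding error: the net cannot tell them apart.
worst_gap = max(abs(a - b) for a, b in zip(above, alias))
```

Any non-linearity that generates harmonics above half the clock rate therefore lands them back in band, which is where a learnt alias filter or a deliberate dither would act.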

Consciousness as the Correlated to Self Action

When the self action of the model produces a correlation in the reality simulation it could be said to have observed a correlation to self in the model. The relation to the situation factorization domain then becomes an obvious connection to equation of virtual actionals given the real actional set. This allows futures, and past observational training. The weighting function of physical error cutting a cookie of size survival plus some splurge.

So it seems “pain” or some milder proxy for bad function should increase situation recognition, reduce recent action, increase the accuracy of reality simulation, improve the percept and perhaps change the context toward known safe positives. An autonomic bypass from the percept to counter action is likely also “grown”.


The situation analysis net is likely to function better with some feedback. The purpose of this feedback is not time evolution estimation, as in the reality simulation, but the use of the factorization of the situation in building a system of meta situational analysis which could include self consciousness. Technically the feedback could be nested recursively and be applied as part of the x input of the reality simulation, but that makes for more complex training.

Considering that many factorization domains have a commutativity structure, it implies that post convolution might be a good way of splitting the network result into “factors”. This is placing the convolution as the last layer and not the first layer.
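The commutativity being relied on can be seen with a plain discrete convolution, where convolving two coefficient sequences is the same as multiplying the polynomials they represent. The factor values here are arbitrary examples, and this is only an illustration of the algebra, not of a trained layer.

```python
def convolve(a, b):
    """Full discrete convolution of two sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

factor_a = [1, 2]        # 1 + 2z
factor_b = [3, 0, 1]     # 3 + z^2
product_ab = convolve(factor_a, factor_b)   # 3 + 6z + z^2 + 2z^3
product_ba = convolve(factor_b, factor_a)   # same result, order swapped
```

Because the order of the factors does not matter, a final convolution layer has the right symmetry for a result that is meant to be split back into unordered “factors”.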

Or FFT for that matter, and in some sense, this layer becomes the first layer of the action decision net of desire.

And the variational encoder ratio for optimal mixing of the networks?




  • Variational auto-encoder. Maximal representation of externality. Normalization average.

  • Time evolution feedback via calculus operators.

  • Produce genetic algorithm modification for estimation feedback.

  • Variational auto-encoder with post convolution or ideal order factorization of variation and causation tree.

  • Threshold action sequencer. Classifier with threshold.

The unity of consciousness is that identified with the knowing of multiple action paths in the imagination, as capable of altering a future percept, and with certainty in achievement of a happy context and situation.

This extends on to the idea of emotive functor attractors as the controlled mechanism for genesis of output from the actional desire. This separates desire as an actional devoid of emotion, in complex with a driving emotion set. What has become of the splurge of biological evolute on the smudged cross product? Does it really assist functional understanding of the power efficiency of self action?

The situation analyser, in performing a domain factorization, applying a feedback and estimating a rule and a correlative later situation, could in principle assist with modelling from rule followed by implication of rule. The Gödel incompleteness of the inferred logic is controlled by “you're stupid” and the implicant “fix yourself” as a splurge cull.

The convergence of the multiple series for the different integral forms has bounds. These could be considered some sophisticated parallel to attractor convergence in fractals. As they have a possible intersection as well as a pseudo digital behaviour (time analytic of halting problem applied to divergence), they can be used to represent some digital manifold while maintaining series differentiability. This implies that c(y), and more importantly f(c(y)), be fed back to the estimation.

The separation of the percept before the estimation in a real sense is the great filter. Some post situation feedback would help. The log scaling is perhaps also quite important. Considering that an exponential half life may be controlled by production of an enzyme to remove the metastable precursor, the multiplicative inverse is quite likely (Newton-Raphson approximant), and integration then makes for a log scaling possibility. Some feed forward of x provides entropy, and some exponentiation or other series decompositions might be useful.
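The chain from multiplicative inverse to log scaling can be sketched numerically. The Newton-Raphson reciprocal iteration x ← x(2 - a·x) needs only multiplication and subtraction, and integrating 1/t from 1 to a gives ln(a). The function names, the floating-point exponent used for the first guess, and the midpoint rule are assumptions made for this sketch.

```python
import math

def reciprocal(a, iterations=6):
    """Newton-Raphson multiplicative inverse of a > 0, no division used."""
    # First guess from the floating-point exponent, so a * x lies in [0.5, 1).
    _, exponent = math.frexp(a)
    x = math.ldexp(1.0, -exponent)
    for _ in range(iterations):
        x = x * (2.0 - a * x)        # the relative error squares on every pass
    return x

def log_by_integration(a, steps=1000):
    """ln(a) as the integral of 1/t from 1 to a (midpoint rule)."""
    h = (a - 1.0) / steps
    return sum(h * reciprocal(1.0 + (k + 0.5) * h) for k in range(steps))

inv = reciprocal(7.0)              # close to 1/7
ln5 = log_by_integration(5.0)      # close to math.log(5.0)
```

So a system that can already form a reciprocal and an integral has a log scale available without any dedicated logarithm operator.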

Author: Jacko
