So if a gradient descent hyper-parameter controlling the learning rate is the usual approach, how could this possibly be improved? Since the estimate of future gradient changes varies from batch to batch, averaging over batches gives a more stable basis from which to infer an accelerated projection of the future descent.
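As a minimal sketch of that idea (all names and constants here are illustrative assumptions, not a worked-out method), the per-batch gradient can be smoothed with an exponential moving average, and the step then projected further along the averaged direction:

```python
import numpy as np

def accelerated_step(w, grad, state, lr=0.01, beta=0.9, proj=1.5):
    # Smooth the noisy per-batch gradient with an exponential moving
    # average (the stable basis), then project the descent a little
    # further along that averaged direction (proj > 1).
    state["avg"] = beta * state["avg"] + (1.0 - beta) * grad
    return w - lr * proj * state["avg"]

# usage:
w = np.zeros(4)
state = {"avg": np.zeros_like(w)}
w = accelerated_step(w, np.ones(4), state)
```

This is close in spirit to momentum methods, with the projection factor standing in for the accelerated estimate of the future descent.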
The biggest problem to consider is bound oscillation: the accelerated projection can magnify the learning delta to apply until the locality becomes asymptotically non-convergent (a reverse symmetry in summation acceleration, treating the divergent terms as “merging toward” the first-term limit). This would then converge as a meta-series in some instances, but not all. It then becomes essential to scale the approximations by inverse power weighting to force convergence for highly entropic, unstable weights. It may also indicate that weight decomposition could be an effective strategy, splitting a neuron into the stable (time-aligned) and the unstable (time-inverted) partitions of a signal.
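A minimal sketch of the inverse power weighting, under my own assumptions (the function name and the exponent p are placeholders): weight the k-th projected delta by k to the power -p, so that for bounded, oscillating deltas and p > 1 the combined update converges by comparison with the p-series.

```python
import numpy as np

def inverse_power_combine(deltas, p=2.0):
    # Weight the k-th delta by k**-p; for bounded deltas and p > 1 the
    # weighted series converges even when the raw deltas oscillate
    # without decaying.
    deltas = np.asarray(deltas, dtype=float)
    weights = np.arange(1, len(deltas) + 1, dtype=float) ** -p
    return (weights[:, None] * deltas).sum(axis=0) / weights.sum()

# usage: an oscillating, non-decaying delta sequence is tamed
print(inverse_power_combine([[1.0], [-1.0], [1.0], [-1.0]]))
```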
Assuming the unstable partition has a repellor (the opposite of an attractor in chaos theory), modelling could be used to invert the accelerated projection toward the repellor. If the accelerated series is approximated by an integral, would the unstable inverse acceleration be a reversal of the limits of integration, or a sign reversal of the integral?
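As it happens, for the integral approximation the two options coincide: reversing the limits of integration is exactly a sign reversal. Writing the assumed accelerated projection with integrand $g(t)$:

```latex
A = \int_0^T g(t)\,dt = -\int_T^0 g(t)\,dt
```

so the repellor-directed inversion $-A$ can be written either way.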
In a sense this splits the network into a composition of multiple networks, partitioned by the number of critical negative signs (or more precisely, the number of quantities that could carry negative signs). In this case just one sign, for time, acts like a hyper-parameter convergence property. After decomposition, the algorithm can then be specifically optimized per partition.
So given that the future estimation could be trained on data from a delayed assumption state, from the past’s prediction of the present, what is missing? What is missing seems to be the time factorization process (an NP problem) and an innovation stimulus, which would cover things that are unknown within the net, as well as a time relevance which was not compensated for (the delay has an opportunity to sample lesser pasts for greater present prediction, but produces only nearer futures without Monte Carlo assumptions for a spread).
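For what those Monte Carlo assumptions for a spread might look like, here is a small sketch (model, noise, and n_samples are hypothetical placeholders): perturb the delayed past state and collect the resulting present predictions, so the estimator returns a spread rather than a single nearer-future point.

```python
import numpy as np

def predictive_spread(model, state, n_samples=64, noise=0.1, seed=0):
    # Sample perturbed versions of the delayed state and run each
    # through the predictor, returning a central estimate and a spread.
    rng = np.random.default_rng(seed)
    samples = np.stack([
        model(state + noise * rng.standard_normal(state.shape))
        for _ in range(n_samples)
    ])
    return samples.mean(axis=0), samples.std(axis=0)

# usage with a toy predictor (state -> prediction):
mean, spread = predictive_spread(lambda s: np.tanh(s), np.zeros(3))
```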
A subnet could be trained to do the estimations of the best assumption for such a predictive engine, leading to trainability for an expected spread entropy (a situational requirement of MUST and/or ANY as GOOD), given a similarity measure between a training output and a random network distillation (RND) classifier. https://www.youtube.com/watch?v=z4lAlVRwbrc is an interview with an author of an interesting paper about AI exploration, which covers the RND idea in a use case. Training a post-RND latent space map to merge lingual or other equivalent factorizations of the novelty could be part of this.
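For context, here is a minimal PyTorch sketch of the core RND mechanic as I understand it (layer sizes and the optimiser are arbitrary choices): a predictor network is trained to imitate a frozen, randomly initialised target network, and the predictor’s error then serves as the novelty signal, low on familiar inputs and high on novel ones.

```python
import torch
import torch.nn as nn

def make_net(d_in=32, d_out=16):
    return nn.Sequential(nn.Linear(d_in, 64), nn.ReLU(), nn.Linear(64, d_out))

target = make_net()                 # fixed, randomly initialised
for p in target.parameters():
    p.requires_grad_(False)         # the target is never trained
predictor = make_net()              # trained to imitate the target
opt = torch.optim.Adam(predictor.parameters(), lr=1e-3)

def novelty(obs):
    # Prediction error against the frozen random target.
    return ((predictor(obs) - target(obs)) ** 2).mean(dim=-1)

def train_step(obs):
    loss = novelty(obs).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# usage: novelty(torch.randn(8, 32)) falls on inputs train_step has seen
```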
The reevaluation of situational state novelty can then become a post-hoc addition of a trained residual, based on the expected future estimation and the purpose to which the predicted estimator is to be put. Imagine pretending on a stage versus acting on a real battlefield: which eventual motor actions should production have for benefit?
Considering that a remarkable abstract datatype or two is possible, and perhaps closely models the human sequential thought process, I wonder today what applications this will have once a suitable execution model, ISA, and microarchitecture have been defined. The properties of controllable locality of storage and motion, along with read and write, branch on stimulus, and other yet-to-be-discovered machine operations, make for a container for a kind of universal Turing machine.
Today is a good day for robot consciousness, although I wonder just how applicable the implementation model is for biological life the universe over. Here’s a free paper on a condensed few months of abstract thought.
It’s not just about IT, but about thrashing through what the mind does, can be made to do, and did; it all leverages information and the growth of modelling and simulation for matched or greater ability.
Yes, it could all be made in neural nets, but given the tools available, why stick with the complexity and lack of density of such a solution? A reasoning accelerator would be cool for my PC. How is this going to come about without much workshop time at the worktop? If only it were just the oil market I could affect; how did it come to pass that I was introduced to the fall of oil, and what other consequential thought sets, and hence productions, could I change?
One might call it wonder and design dressed in “accidental” reckless endangerment. What should be a simple, obvious benefit to the world becomes embroiled in competition, in the drive for profit through control of the “others”, the making of a non-happening which upsets vested interests.
Who’d have thought it from this little cul-de-sac of a planetary system. Not exactly galactic mainline. And the winner is not halting for a live mind.
It’s been a few weeks, and I’ve been writing a document on AI and AGI which is currently internal and selectively distributed. There is definitely a lot to try out, including new network arrangements and layer types, and a fundamental insight of the Category Space Theorem and how it relates to training sets for categorization or classification AIs.
Basically, the category space is normally created with only one network loss function to minimise on backpropagation. It can be engineered so this is not true, so that training data does not compete so much in a zero-sum game between categories. There is also some information context for an optimal order in categorization when using non-exact storage structures.
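One hedged illustration of removing that zero-sum competition (my reading, not the book’s construction): with a single softmax cross-entropy, probability mass gained by one category is necessarily taken from the others, whereas independent per-category sigmoid losses decouple the categories.

```python
import torch
import torch.nn as nn

logits = torch.randn(8, 5)                      # batch of 8, 5 categories
multi_hot = torch.randint(0, 2, (8, 5)).float() # multi-hot labels

# Zero-sum: softmax couples the categories into one competition.
zero_sum = nn.CrossEntropyLoss()(logits, multi_hot.argmax(dim=1))

# Non-zero-sum: each category has its own independent sigmoid loss.
non_zero_sum = nn.BCEWithLogitsLoss()(logits, multi_hot)
```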
It’s a small book with some bad typesetting, but getting the information out is more important for a first edition. Feedback and sales are the best way for me to decide if, and what, to put in a second edition. It may be light on mathematical equations, but it does need an in-depth understanding of neural networks and some computer science.
The product development starts soon, from the initial work done over the last few weeks: an AI which aims to be more performant per unit cost. This is to be done by adding in “special functional units” optimized for effects that are better done by these units than by a pure neural network.
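As a sketch of what such a “special functional unit” might look like (the class and the choice of an FFT magnitude as the unit are purely illustrative assumptions): a learned linear path is summed with a fixed, cheap transform that a pure network would otherwise spend many neurons approximating.

```python
import torch
import torch.nn as nn

class SpecialFunctionalUnit(nn.Module):
    # Hypothetical hybrid layer: a learned linear path plus a fixed
    # functional unit (an FFT magnitude here), with a second learned
    # layer deciding how much of the unit's output to use.
    def __init__(self, d):
        super().__init__()
        self.linear = nn.Linear(d, d)
        self.mix = nn.Linear(d, d)

    def forward(self, x):
        fixed = torch.fft.rfft(x, n=2 * (x.shape[-1] - 1)).abs()
        return self.linear(x) + self.mix(fixed)

# usage:
y = SpecialFunctionalUnit(32)(torch.randn(4, 32))
```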
So apart from mildly funny AaaS selling jokes, this is a serious project initiative. The initial tests, when available, will compare the resources used to achieve a level of functional equivalence. In this regard, I am not expecting superlative leaps forward, although those would be nice, but gains in the general trend of applying AI to specific tasks, to start.
By extending the already available sources (quite a few) with flexible licences, the plan is to build easy-to-use AI with some modifications, perhaps extensions to open standards such as ONNX, and on to maybe VHDL for FPGA and perhaps ASIC.