Collections and Operator Overloading NOT in JS

Well, that was almost a disappointment for optimising ordered collection renders by using arrays. But I have an idea, and will keep you informed. You can check the colls.js as it evolves. I won’t spoil the excitement. The class in the end should do almost everything an array can do, and more. Some features will be slightly altered as it’s a collection, and not just an object with an array prototype daddy. The square brackets used to index arrays are out. I’m sure I’ve got a good workaround. I’m sure I remember from an old Sun Microsystems book on JavaScript that text indexed properties can do indexing in the array, but that was way back when “some random stuff is not an object” was a JS error message for just about everything.

One of these days I might make a mangler to output JS from a nicer, less coerced but more operator coerce-able language syntax. Although I have to say the way ECMAScript 6 is going, it’s a bit nuts, with pointless static and many other “features”. How about preventing JavaScript’s habit of just slinging un-var-ed variables into the global namespace without a corresponding var declaration in that scope? The argument for the habit is that it is easy to throw a value to a test observer to debug; the argument against is that spelling errors in variable names become harder to find at parse time. If the code was in flight while failing, then knowing the code will indicate where the fault lies. For other people’s code a line number or search string would be better. Either works for me.
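For what it is worth, ES5 strict mode already turns an assignment to an undeclared name into a ReferenceError. It fires at run time rather than parse time, so it only covers part of the complaint. A minimal sketch, with a made-up function and variable name:

```javascript
// Strict mode: assigning to an undeclared name throws instead of
// quietly creating a global. Names here are illustrative only.
function strictAssign() {
  'use strict';
  try {
    mispelledCounter = 1; // no var/let/const for this name in any scope
    return 'leaked';
  } catch (e) {
    return e.name; // the engine reports a ReferenceError
  }
}
```

The error still only appears when the line actually runs, so a misspelling on a rarely taken branch can hide for a while.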

My favourite mind mess would be { .[“something”]; anything; } for dynamic tag based on the value of something in object expressions. Giggle.

Spoiler Alert

Yes, I’ve decided to make the base array a set of ordered keys, based on an ordering set of key names, so that many operations can be optimised by binary divide and conquer. For transparent access, Proxy object support in the browser will be required. I may provide put and get methods for older situations, and also because that would allow for a multi keyed index. The primary key is based on all supplied initial keys, their compare functions and the priority order [a, b, c, …], with automatic secondary keys of [b, c, …], [c, …], […], … for when they are needed. I’ll start the rewrite soon. Of course the higher operations like split and splice won’t be available on the auto secondary keys, but in years of database design I’ve never needed one of such form.
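As a rough illustration of the idea, and not the colls.js code itself, here is a sketch of a collection kept sorted by a prioritised key list, with binary search for put and get and a Proxy for square bracket access. The class name OrderedColl, the helper indexed, and the { name, compare } key shape are all assumptions made for the example.

```javascript
// Hypothetical sketch: objects kept sorted by a prioritised list of keys.
class OrderedColl {
  // keys: [{ name, compare }] in priority order [a, b, c, ...]
  constructor(keys) {
    this.keys = keys;
    this.items = []; // always kept in primary key order
  }

  // Compare two objects key by key, highest priority first.
  compare(x, y) {
    for (const { name, compare } of this.keys) {
      const c = compare(x[name], y[name]);
      if (c !== 0) return c;
    }
    return 0;
  }

  // Binary divide and conquer: first index whose item is not less than probe.
  lowerBound(probe) {
    let lo = 0, hi = this.items.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.compare(this.items[mid], probe) < 0) lo = mid + 1;
      else hi = mid;
    }
    return lo;
  }

  put(obj) {
    const i = this.lowerBound(obj);
    if (i < this.items.length && this.compare(this.items[i], obj) === 0) {
      this.items[i] = obj; // same full key: replace
    } else {
      this.items.splice(i, 0, obj); // insert in order
    }
    return this;
  }

  get(probe) {
    const hit = this.items[this.lowerBound(probe)];
    return hit !== undefined && this.compare(hit, probe) === 0 ? hit : undefined;
  }
}

// Proxy wrapper so coll[n] reads the n-th item in key order.
function indexed(coll) {
  return new Proxy(coll, {
    get(target, prop, receiver) {
      if (typeof prop === 'string' && /^\d+$/.test(prop)) {
        return target.items[Number(prop)];
      }
      return Reflect.get(target, prop, receiver);
    },
  });
}

const byNumber = (a, b) => a - b;
const byText = (a, b) => (a < b ? -1 : a > b ? 1 : 0);
const people = indexed(new OrderedColl([
  { name: 'age', compare: byNumber },
  { name: 'name', compare: byText },
]));
people.put({ age: 40, name: 'Bo' }).put({ age: 30, name: 'Cy' }).put({ age: 30, name: 'Al' });
// people[0] is now the { age: 30, name: 'Al' } object
```

The put here still pays for an Array splice on insert; only the search is logarithmic.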

The filt.js script will then extend utility by allowing any of the auto secondaries to be treated as a primary on the filter view, with specification of an equals, or a min and max range. All will share the common hidden array of objects in a particular collection for space efficiency reasons. This should make a medium fast local database structure possible, with reasonable scaling. Today and tomorrow though will be spent on a meeting, and on an effective partitioning strategy to avoid “full table pull” requests to the server.
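One way the filter view could look, assuming each view keeps only an ordering of positions while the objects themselves stay in the shared array. The names makeView, range and equals are invented for this sketch and are not the filt.js API.

```javascript
// Hypothetical sketch: a filter view that orders a shared array by one key.
function makeView(shared, keyName, compare) {
  // Only positions are stored; the objects stay in the shared array.
  const order = shared.map((_, i) => i)
    .sort((i, j) => compare(shared[i][keyName], shared[j][keyName]));

  // First slot in `order` whose key value is accepted.
  // `accept` must be false for a prefix of the order and true after it.
  function firstWhere(accept) {
    let lo = 0, hi = order.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (accept(shared[order[mid]][keyName])) hi = mid;
      else lo = mid + 1;
    }
    return lo;
  }

  return {
    // All objects with min <= key <= max, in key order.
    range(min, max) {
      const from = firstWhere(v => compare(v, min) >= 0);
      const to = firstWhere(v => compare(v, max) > 0);
      return order.slice(from, to).map(i => shared[i]);
    },
    equals(value) {
      return this.range(value, value);
    },
  };
}

const rows = [
  { id: 1, price: 30 },
  { id: 2, price: 10 },
  { id: 3, price: 20 },
  { id: 4, price: 20 },
];
const byPrice = makeView(rows, 'price', (a, b) => a - b);
// byPrice.range(15, 25) returns the rows with id 3 and id 4
```

The view would need rebuilding, or updating in step, whenever the shared array changes; the sketch ignores that.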

Further Improvements

The JSON encoded collation order was chosen to prevent bad comparisons between objects with silly string representations. It might be extended, such that a generic text search and object key ordering are given some possibility. This is perhaps another use the compression can be put to, as the BWT in the __ module has good search characteristics. Something to think over. It looks as though the code would be slow, as it depends on heavy use of splice. This does suggest an optimization by making another Array subclass named SpliceArray which uses an n-tree with sub element and leaf count and cumulative tally, for O(log n) splice performance.
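To show what the JSON collation buys, here is a minimal sketch. The function name jsonCompare is an assumption, and it is only one reading of “JSON encoded collation order”.

```javascript
// Compare two values by their JSON text rather than by String(value).
function jsonCompare(a, b) {
  const x = JSON.stringify(a), y = JSON.stringify(b);
  return x < y ? -1 : x > y ? 1 : 0;
}

const a = { n: 2 }, b = { n: 1 };
String(a) === String(b);        // true: both are "[object Object]"
[a, b].sort(jsonCompare)[0].n;  // 1, because '{"n":1}' < '{"n":2}'
```

Two caveats come with comparing the text: numbers collate as strings, so 10 sorts before 9, and { a: 1, b: 2 } differs from { b: 2, a: 1 } because JSON.stringify follows property insertion order. The second is where an object key ordering extension would come in.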

RiotEmbed.js Coding Going Well

I’ve done more on riotEmbed today. Developed a system for hash code checking any scripts dynamically loaded. This should stop injection of just any code. There is also some removal of semantic and syntax errors based on ‘this’ and ‘call’, and some confusion between JSON, and JS which can contain roughly JSON, where ‘x: function()’ and ‘function x()’ are not quite the same.
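The hash check could look something like the sketch below. The hash used by riotEmbed is not stated here, so a 32-bit FNV-1a style hash over the string’s code units stands in for it, and the names fnv1a and runIfTrusted are invented for the example.

```javascript
// Stand-in hash: 32-bit FNV-1a style, applied to UTF-16 code units.
// This is an assumption; the real riotEmbed hash is not shown in the post.
function fnv1a(text) {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, '0');
}

// Run the fetched source only if its hash matches the one pinned by the caller.
function runIfTrusted(source, expectedHash) {
  if (fnv1a(source) !== expectedHash) {
    throw new Error('script hash mismatch, refusing to run');
  }
  return new Function(source)();
}

const source = 'return 6 * 7;';
const pinned = fnv1a(source); // in practice computed ahead of time and shipped with the page
```

A non cryptographic hash like this only catches accidental changes. To resist deliberate injection, a cryptographic digest such as SHA-256 (available in browsers through crypto.subtle.digest) would be the safer choice.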

I’ll do some planning of DB schema tomorrow …

The following link is a QUnit testing file I set up, which does no real tests yet, but is good for browser code testing, and much easier than the convoluted Travis CI virtual machine excess.

QUnit Testing

I’ve added in a dictionary acceleration method to the LZW, and called it PON (Packed Object Notation), which is only really effective when used after a BWT as in the pack method. Also some Unicode compression was added, which users of local 64 character sets will like. This leaves a final point in the compression layer of UCS-2 to UTF-8 conversion at the XMLHttpRequest boundary. By default this uses a text interface, and so the 16 bit characters native to JavaScript strings are UTF-8 encoded and decoded at the eventual net octet streaming. As the PON is expected to be large (when compression is really needed) compared to any other uncompressed JSON, there is an argument to serialize for high dictionary codes (doubling of uncompressed JSON size, and almost a third of PON size), or post UTF to apply SUTF coding (cutting one third off compressed PON without affecting uncompressed JSON). The disadvantage to this is on the server side. The PON is the same, but the uncompressed JSON part will need an encoding and decoding function pair, and hence consume computational and memory buffer resources on the server.

As the aim of this compression layer is to remove load off the server, this is something to think about. The PON itself will not need that encoding taking off or putting on. As much of the uncompressed JSON part will be for SQL where clauses, and as indexes for arrays, the server can be considered ignorant of SUTF, as long as all literals used in the PHP script are ASCII. This is likely for all JSON keys, and literal values. So the second option on the client side of SUTF of the UTF would be effective. Some would say put Gzip on the server, but that would be more server load, which is to be avoided for scaling. I wonder if SUTF was written to accept use of the Uint8Array type?

Some hindsight analysis shows the one third gain is not likely to be realized except with highly redundant data. More realistic data has a wider than 64 dictionary code spread, and the middle byte of a UTF-8 is the easiest one to drop on repeats. The first byte contains the length indication, and as such the code would become much more complex to drop the high page bits, by juggling the lower four bits (0 to 3) in byte two for the lower four bits of byte one. Possibly a self inverse function … Implemented (no testing yet). The exact nature of JSON input to the pack function is next on the list client side, with the corresponding server side requirements of maintaining a searchable store, and distribution replication consistency.

The code spread is now 1024 symbols (maintaining easy decode and ASCII preservation), as anything larger would affect bits four and five in byte two and change the one third saving on three byte UTF-8 code points. There are 2048 dictionary codes before this compression is even used, and so it only applies for larger inputs. As the dictionary codes are slightly super linear, I did have an idea to normalize them by subtracting a linear growth over time, and then “invert the negative bulges” where lower and hence shorter dictionary codes were abnormal to the code growth trend. This is not applicable though, as not enough information is easily available in a compressed stream to recover the coding. Well at least it gives something for gzip to have a crack at for those who want to burden the server.
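The bit positions are easier to see with real bytes. The sketch below only prints standard three byte UTF-8 encodings; it is not the SUTF code. The base value 0x4000 is an arbitrary, hypothetical start of a 1024 symbol block, picked so the block is aligned.

```javascript
// Return the UTF-8 bytes of one code point as a plain array.
function utf8Bytes(codePoint) {
  return Array.from(new TextEncoder().encode(String.fromCodePoint(codePoint)));
}

// Three byte form: 1110aaaa 10bbbbbb 10cccccc
// a = code point bits 12-15, b = bits 6-11, c = bits 0-5.
const base = 0x4000;                   // hypothetical aligned block start
const first = utf8Bytes(base);         // [0xE4, 0x80, 0x80]
const in64 = utf8Bytes(base + 63);     // [0xE4, 0x80, 0xBF] only byte three moved
const in1024 = utf8Bytes(base + 1023); // [0xE4, 0x8F, 0xBF] low four bits of byte two moved
const past = utf8Bytes(base + 1024);   // [0xE4, 0x90, 0x80] bit four of byte two is now set
```

Inside the 1024 symbol block the first byte never changes and byte two only varies in bits 0 to 3, which is why a wider spread would start touching bits four and five.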

A Server Side Java Jetty Persistent Socket API is in Development

I looked into various available solutions, but for full back end customization I have decided on a persistent socket layer of my own design. The Firebase FCM module supplies the URL push for pull connections (Android client side), and an excellent SA-IS library class under MIT licence is used to provide FilterStream compression (BWT with contextual LZW). The whole thing is Externalizable, by design, and so far looking better than any solution available for free. Today is to put in more thought on the scalability side of things, as this will be difficult to rectify later.

Finding out how to make a JavaEE/Jetty Servlet project extension in Android Studio was useful, and I’d suggest the Embedded Jetty to anyone, and the Servlet API is quite a tiny part of the full jetty download. It looks like the back end becomes a three Servlet site, and some background tasks built on the persistent streams. Maybe some extension later, but so far even customer details don’t need to be stored.

The top level JSONAsync object supports keepUpdated() and clone() with updateTo(JSONObject) for backgrounded two directional compressed sync with off air and IP number independent functionality. The clone() returns a new JSONObject so allowing edits before updateTo(). The main method of detecting problems is errors in decoding the compressed stream. The code detects this, and requests a flush to reinitialize the compression dictionary. This capture of IOException with Thread looping and yield(), provides for re-establishment of the connection.

The method updateTo() is rate regulated, and may not succeed in performing an immediate update. The local copy is updated, and any remote updates can be joined with further updateTo() calls. A default thread will attempt a 30 second synchronization tick, if there is not one in progress. The server also checks for making things available to the client every 30 seconds, but this will not trigger a reset.

The method keepUpdated() is automatically rate regulated. The refresh interval holds off starting new refreshes until the current refresh is completed or failed. Refreshing is attempted as fast as necessary, but if errors start occurring, the requests to the server are slowed down.
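The regulation policy can be sketched independently of the Java code, which is not shown here. The version below is in JavaScript for consistency with the earlier examples; makeRegulator, tryStart and finish are invented names, and doubling the delay on each error is an assumption about how the slow down might work.

```javascript
// Hypothetical sketch of the rate regulation policy, not the Java implementation.
function makeRegulator(baseMs, maxMs) {
  let delayMs = baseMs;
  let inFlight = false;
  return {
    // True if a new refresh may start now; false while one is still running.
    tryStart() {
      if (inFlight) return false;
      inFlight = true;
      return true;
    },
    // Call when the refresh completes or fails; returns the wait before the next one.
    finish(ok) {
      inFlight = false;
      // Assumed policy: reset on success, double (up to a cap) on failure.
      delayMs = ok ? baseMs : Math.min(delayMs * 2, maxMs);
      return delayMs;
    },
  };
}
```

A caller would check tryStart() on each tick, run the refresh, then schedule the next tick using the value returned by finish().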

The method trimSync() removes inactive channels in any context where a certainty of connectivity is known. This is to prevent memory leaks. The automatic launching of a ClientProcessor when a new client FCM idToken is received looks nice, with restoration of the socket layer killing ones which are not unique. The control flow can be activated, and code in the flow must be written such that no race condition exists, such as performing two writes. A process boot lock until the first control flow activator provides a sufficient guard against this, given the otherwise sequential dependency of and on a set of JSONAsync objects.