VS Code and Elm

I’ve been looking into doing JS trans-compiled languages recently. The usual suspects popped up. ClosureScript, Elm, TypeScript, and maybe a few others. This had the unexpected effect of needing the VS Code software as some of the plugins do not yet work with VS Community. I opened up some TypeScript I had wrote recently, and found the way “require” is used for loading is not recognised in VS Code. Strange, and it might cause problems with passing on code to others.

I looked into Elm which is a Haskell for JS. It looks quite good, and I’ve downloaded the kit. I’ll let you know if I start using it big. Closure is Scheme or LISP almost. It seems to have little editor support compared to Elm, and I’d prefer to use Elm over Closure. I already have some libraries downloaded for functional extensions to JS, and some .d.ts descriptors too for some. The main reason I’ve never used Haskell is the large GHC binary size. The idea of using JS as the VM is good. It does however dump about 6000 lines of JS code for a hello world. I haven’t tested if this is per module. I understand Elm can do very fast HTML rendering though, so something to look into.

There’s also Haxe of course, and plenty of plain vanilla JS functional programming modules, including some like RQ, for threading control. Some nice Monad libraries, and good browser support. I also like the TinyMCE. It’s quite a classic. For the toolkit, Bootstrap.js is the current best with all the needed features of a modern looking site.

Beware much ado about category theory, and things like the continuation monad can do all sequential processing … of course from the context of writing it in a sequential language … blah, blah, stored state, pretend there’s none, blah, blah, monad, blah, delay output by wrapper, blah. Ok, well it is true I’m 46, but you young coders out there should take some of the symbology with quite a big pinch of salt, and maybe have a more interesting look at things like the Y combinator. It’s kind of what Mathematica would call Hold[] but with more monad blah for what is really group theory.

TypeScript in Visual Studio Community 2017

Just loaded up the VS 2017 community release. The TS 2.1 features include strong types relating to null and undefined. A bit annoying, but none the less it does force some decent console logging of salient error potential. I’ve found that many apparent error conditions are being removed. If only I could find a way to stop the coercion in JS, such that it is. It is one of the things I never liked about JS. I’d prefer an explicit cast every time.

General Update

An update on the current progress of projects and general things here at KRT. I’ve set about checking out TypeScript for using in projects. It looks good, has some hidden pitfalls on finding .m.ts files for underscore for example, but in general looks good. I’m running it over some JS to get more of a feel. The audio VST project is moving slowly, at oscillators at the moment, with filters being done. I am looking into cache coherence algorithms and strategies to ease hardware design at the moment too. The 68k2 document mentioned in previous post is expanding with some of these ideas in having a “stall on value match” register, with a “touch since changed” bit in each cache line.

All good.

The Processor Design Document in Progress

TypeScript

Well I eventually managed to get a file using _.reduce() to compile without errors now. I’ll test it as soon as I’ve adapted in QUnit 2.0.1 so I can write my tests to the build as a pop up window, an perhaps back load a file to then be able to save the file from within the editor, and hence to become parser frame.

Representation

An excerpt from the 68k2 document as it’s progressing. An idea on UTF8 easy indexing and expansion.

“Reducing the size of this indexing array can recursively use the same technique, as long as movement between length encodings is not traversed for long sequences. This would require adding in a 2 length (11 bit form) and a 3 length (16 bit form) of common punctuation and spacing. Surrogate pair just postpones the issue and moves cache occupation to 25%, and not quite that for speed efficiency. This is why the simplified Chinese is common circa 2017, and surrogate processing has been abandoned in the Unicode specification, and replaced by characters in the surrogate representation space. Hand drawing the surrogates was likely the issue, and character parts (as individual parts) with double strike was considered a better rendering option.

UTF8 therefore has a possible 17 bit rendering for due to the extra bit freed by not needing a UTF32 representation. Should this be glyph space, or skip code index space, or a mix? 16 bit purity says skip code space. With common length (2 bit) and count (14 bit), allowing skips of between 16 kB and 48 kB through a document. The 4th combination of length? Perhaps the representation of the common punctuation without character length alterations. For 512 specials in the 2 length form and 65536 specials in the 3 length forms. In UTF16 there would be issues of decode, and uniqueness. This perhaps is best tackled by some render form meta characters in the original Unicode space. There is no way around it, and with skips maybe UTF8 would be faster.”


// tool.js 1.1.1
// https://kring.co.uk
// (c) 2016-2017 Simon Jackson, K Ring Technologies Ltd
// MIT, like as he said. And underscored 😀

import * as _ from 'underscore';

//==============================================================================
// LZW-compress a string
//==============================================================================
// The bounce parameter if true adds extra entries for faster dictionary growth.
// Usually LZW dictionary grows sub linear on input chars, and it is of note
// that after a BWT, the phrase contains a good MTF estimate and so maybe fine
// to append each of its chars to many dictionary entries. In this way the
// growth of entries becomes "almost" linear. The dictionary memory foot print
// becomes quadratic. Short to medium inputs become even smaller. Long input
// lengths may become slightly larger on not using dictionary entries integrated
// over input length, but will most likely be slightly smaller.

// DO NOT USE bounce (=false) IF NO BWT BEFORE.
// Under these conditions many unused dictionary entries will be wasted on long
// highly redundant inputs. It is a feature for pre BWT packed PONs.
//===============================================================================
function encodeLZW(data: string, bounce: boolean): string {
var dict = {};
data = encodeSUTF(data);
var out = [];
var currChar;
var phrase = data[0];
var codeL = 0;
var code = 256;
for (var i=1; i<data.length; i++) {
currChar=data[i];
if (dict['_' + phrase + currChar] != null) {
phrase += currChar;
}
else {
out.push(codeL = phrase.length > 1 ? dict['_'+phrase] : phrase.charCodeAt(0));
if(code < 65536) {//limit
dict['_' + phrase + currChar] = code;
code++;
if(bounce && codeL != code - 2) {//code -- and one before would be last symbol out
_.each(phrase.split(''), function (chr) {
if(code < 65536) {
while(dict['_' + phrase + chr]) phrase += chr;
dict['_' + phrase + chr] = code;
code++;
}
});
}
}
phrase=currChar;
}
}
out.push(phrase.length > 1 ? dict['_'+phrase] : phrase.charCodeAt(0));
for (var i=0; i<out.length; i++) {
out[i] = String.fromCharCode(out[i]);
}
return out.join();
}

function encodeSUTF(s: string): string {
s = encodeUTF(s);
var out = [];
var msb: number = 0;
var two: boolean = false;
var first: boolean = true;
_.each(s, function(val) {
var k = val.charCodeAt(0);
if(k > 127) {
if (first == true) {
first = false;
two = (k & 32) == 0;
if (k == msb) return;
msb = k;
} else {
if (two == true) two = false;
else first = true;
}
}
out.push(String.fromCharCode(k));
});
return out.join();
}

function encodeBounce(s: string): string {
return encodeLZW(s, true);
}

//=================================================
// Decompress an LZW-encoded string
//=================================================
function decodeLZW(s: string, bounce: boolean): string {
var dict = {};
var dictI = {};
var data = (s + '').split('');
var currChar = data[0];
var oldPhrase = currChar;
var out = [currChar];
var code = 256;
var phrase;
for (var i=1; i<data.length; i++) {
var currCode = data[i].charCodeAt(0);
if (currCode < 256) {
phrase = data[i];
}
else {
phrase = dict['_'+currCode] ? dict['_'+currCode] : (oldPhrase + currChar);
}
out.push(phrase);
currChar = phrase.charAt(0);
if(code < 65536) {
dict['_'+code] = oldPhrase + currChar;
dictI['_' + oldPhrase + currChar] = code;
code++;
if(bounce && !dict['_'+currCode]) {//the special lag
_.each(oldPhrase.split(''), function (chr) {
if(code < 65536) {
while(dictI['_' + oldPhrase + chr]) oldPhrase += chr;
dict['_' + code] = oldPhrase + chr;
dictI['_' + oldPhrase + chr] = code;
code++;
}
});
}
}
oldPhrase = phrase;
}
return decodeSUTF(out.join(''));
}

function decodeSUTF(s: string): string {
var out = [];
var msb: number = 0;
var make: number = 0;
var from: number = 0;
_.each(s, function(val, idx) {
var k = val.charCodeAt(0);
if (k > 127) {
if (idx < from + make) return;
if ((k & 128) != 0) {
msb = k;
make = (k & 64) == 0 ? 2 : 3;
from = idx + 1;
} else {
from = idx;
}
out.push(String.fromCharCode(msb));
for (var i = from; i < from + make; i++) {
out.push(s[i]);
}
return;
} else {
out.push(String.fromCharCode(k));
}
});
return decodeUTF(out.join());
}

function decodeBounce(s: string): string {
return decodeLZW(s, true);
}

//=================================================
// UTF mangling with ArrayBuffer mappings
//=================================================
declare function escape(s: string): string;
declare function unescape(s: string): string;

function encodeUTF(s: string): string {
return unescape(encodeURIComponent(s));
}

function decodeUTF(s: string): string {
return decodeURIComponent(escape(s));
}

function toBuffer(str: string): ArrayBuffer {
var arr = encodeSUTF(str);
var buf = new ArrayBuffer(arr.length);
var bufView = new Uint8Array(buf);
for (var i = 0, arrLen = arr.length; i < arrLen; i++) {
bufView[i] = arr[i].charCodeAt(0);
}
return buf;
}

function fromBuffer(buf: ArrayBuffer): string {
var out: string = '';
var bufView = new Uint8Array(buf);
for (var i = 0, arrLen = bufView.length; i < arrLen; i++) {
out += String.fromCharCode(bufView[i]);
}
return decodeSUTF(out);
}

//===============================================
//A Burrows Wheeler Transform of strings
//===============================================
function encodeBWT(data: string): any {
var size = data.length;
var buff = data + data;
var idx = _.range(size).sort(function(x, y){
for (var i = 0; i < size; i++) {
var r = buff[x + i].charCodeAt(0) - buff[y + i].charCodeAt(0);
if (r !== 0) return r;
}
return 0;
});

var top: number;
var work = _.reduce(_.range(size), function(memo, k: number) {
var p = idx[k];
if (p === 0) top = k;
memo.push(buff[p + size - 1]);
return memo;
}, []).join('');

return { top: top, data: work };
}

function decodeBWT(top: number, data: string): string { //JSON

var size = data.length;
var idx = _.range(size).sort(function(x, y){
var c = data[x].charCodeAt(0) - data[y].charCodeAt(0);
if (c === 0) return x - y;
return c;
});

var p = idx[top];
return _.reduce(_.range(size), function(memo){
memo.push(data[p]);
p = idx[p];
return memo;
}, []).join('');
}

//==================================================
// Two functions to do a dictionary effectiveness
// split of what to compress. This has the effect
// of applying an effective dictionary size bigger
// than would otherwise be.
//==================================================
function tally(data: string): number[] {
return _.reduce(data.split(''), function (memo: number[], charAt: string): number[] {
memo[charAt.charCodeAt(0)]++;//increase
return memo;
}, []);
}

function splice(data: string): string[] {
var acc = 0;
var counts = tally(data);
return _.reduce(counts, function(memo, count: number, key) {
memo.push(key + data.substring(acc, count + acc));
/* adds a seek char:
This assists in DB seek performance as it's the ordering char for the lzw block */
acc += count;
}, []);
}

//=====================================================
// A packer and unpacker with good efficiency
//=====================================================
// These are the ones to call, and the rest sre maybe
// useful, but can be considered as foundations for
// these functions. some block length management is
// built in.
function pack(data: any): any {
//limits
var str = JSON.stringify(data);
var chain = {};
if(str.length > 524288) {
chain = pack(str.substring(524288));
str = str.substring(0, 524288);
}
var bwt = encodeBWT(str);
var mix = splice(bwt.data);

mix = _.map(mix, encodeBounce);
return {
top: bwt.top,
/* tally: encode_tally(tally), */
mix: mix,
chn: chain
};
}

function unpack(got: any): any {
var top: number = got.top || 0;
/* var tally = got.tally; */
var mix: string[] = got.mix || [];

mix = _.map(mix, decodeBounce);
var mixr: string = _.reduce(mix, function(memo: string, lzw: string): string {
/* var key = lzw.charAt(0);//get seek char */
memo += lzw.substring(1, lzw.length);//concat
return memo;
}, '');
var chain = got.chn;
var res = decodeBWT(top, mixr);
if(_.has(chain, 'chn')) {
res += unpack(chain.chn);
}
return JSON.parse(res);
}

 

Collections and Operator Overloading NOT in JS

Well, that was almost a disappointment for optimising ordered collection renders by using arrays. But I have an idea, and will keep you informed. You can check the colls.js as it evolves. I wont spoil the excitement. The class in the end should do almost everything an array can do, and more. Some features will be slightly altered as it’s a collection, and not just an an object with an array prototype daddy. The square brackets used to index arrays are out. I’m sure I’ve got a good work around. I’m sure I remember from an old Sun Microsystems book on JavaScript text indexed properties can do indexing in the array, but that was way back when “some random stuff is not an object” was a JS error message for just about everything.

One of these days I might make a mangler to output JS from a nicer, less coerced but more operator coerce-able language syntax. Although I have to say the way ECMAScript 6 is going, it’s a bit nuts. With pointless static and many other “features”. How about preventing JavaScript’s habit of just slinging un-var-ed variables into the global namespace without a corresponding var declaration in that scope? The arguments for and against are easy to throw a value to a test observer to debug, versus harder to find spelling errors in variable names at parse time. If the code was in flight while failing, then knowing the code will indicate where the fault lies. For other people’s code a line number or search string would be better. Either works for me.

My favourite mind mess would be { .[“something”]; anything; } for dynamic tag based on the value of something in object expressions. Giggle.

Spoiler Alert

Yes, I’ve decided to make the base array a set of ordered keys, based on an ordering set of key names, so that many operations can be optimised by binary divide and conquer. For transparent access a Proxy object support in the browser will be required. I may provide put and get methods for older situations, and also because that would allow for a multi keyed index. The primary key based on all supplied initial keys and their compare functions to use and the priority order [a, b, c, …], and automatic secondary keys of [b, c, …], [c, …], […], … for when there needs to be so, I’ll start the rewrite soon. Of course the higher operations like split and splice won’t be available on the auto secondary keys, but in years of database design I’ve never could have not been one of such form.

The filt.js script will then extend utility by allowing any of the auto secondaries to be treated as a primary on the filter view, and specification of an equals, or a min and max range. All will share the common hidden array of objects in a particular collection for space efficiency reasons. This should make a medium fast local database structure possible, with reasonable scaling. Today and tomorrow though will be spent on a meeting, and effective partitioning strategy to avoid a “full table pull requests” to the server.

Further Improvements

The JSON encoded collation order was chosen to prevent bad comparisons between objects with silly string representations. It might be extended, such that a generic text search, and object key ordering are given some possibility. This is perhaps another use the compression can be put as BWT in the __ module has good search characteristics. Something to think over. It looks as though the code would be slow depending on heavy use of splice. This does suggest an optimization by making another Array subclass named SpliceArray which uses an n-tree with sub element and leaf count and cumulative tally, for O(1) splice performance.

WordPress. What’s it like?

I’ve been using WordPress for a while now, and it’s OK. The admin interface can take a little to get used to, especially when plugins throw menus all over the place, but the online help is very good. The main issues recently were with configuring an email sendmail system so that WordPress could send emails. Upgrading can also be a bit of a pain too, The main issue at the moment is building a dynamic web app. The option to edit WordPress PHP is a no, on the grounds of updates of WordPress source overwrite changes. Creation of a WordPress plugin is also an option, but was not chosen as I want client rendering, not server rendering in the app, to keep the server loading low. I see WordPress as a convenience, not a perfection. So the decision was taken to use client side JavaScript rendering, and have one single PHP script (in the WordPress root folder) which supplies JSON from extra tables created in the WordPress database. An eventual plugin for WordPress may be possible to install this script, and a few JavaScript files to bootstrap the client engine, but not at this time.

This way of working also means the code can be WordPress transparent, and be used for other site types, and an easier one script conversion for node.js for example. With an install base of over 70 million, and an easy templating CMS, WordPress is a good, but pragmatic choice for this site so far. The other main decisions then all related to the client JavaScript stack. I decided to go for riot as the templating engine as it is lightweight and keeps things modular. Some say ReactJS is good, and it does look it, but I found riot.js which looks just as good, is smaller as an include (Have you seen the page source of a WordPress page?) and has client side rendering easily. And then looking at either underscore.js or lodash.js (I picked underscore), for a basic utility. The next up is the AJAX layer. While WordPress does include jQuery, for independence from WordPress tie ins, this ruled out the OK backbone.js and also a fully custom layer allows me to experiment with bandwidth reduction using data compression as a research opportunity. So I have laid out a collections architecture for myself.

Connecting riot.js to this custom layer should be effectively easy. The only other issue was then a matter of style sheet processing to enhance consistency of style. The excellent less.js was chosen for this. Even though client side rendering was also chosen, which is sub optimal (cached, but uses time, but also allows the possibility and later opportunity of meta manipulation of say colour values as sets, for CSS design compression), but does have freedom from tie in to a particular back end solution (a single PHP script at present). So that becomes the stack in its entire form minus the application. Well I can’t wright everything at once, and the end user experience means the application form must be finalized last, as possibility only remains so this side of implementation. For the record I also consider the collections layer a research opportunity. I’ve seen a good few technologies fail to be anything but good demos, and some which should have remained demos but had good funding. Ye old time to market, and sell a better one later for double the sales. Why not buy one quality one later?

The Inquiry Forms Emailing Now Works

Just set up sendmail on the server using these instructions. Now it’s possible to send inquiries. There will also be other forms as and when relevant. The configuration had an extra step of using a cloud file which behaves as a master for sendmail.mc, but this was no big deal. It all took less than ten minutes. A minor complexity is now to forward the response email to somewhere useful to tie up that loose end for later.

RiotEmbed.js Coding Going Well

I’ve done more on riotEmbed today. Developed a system for hash code checking any scripts dynamically loaded. This should stop injection of just any code. There is also some removal of semantic and syntax errors based on ‘this’ and ‘call’, and some confusion between JSON, and JS which can contain roughly JSON, where ‘x: function()’ and ‘function x()’ are not quite the same.

I’ll do some planing of DB schema tomorrow …

The following link is a QUnit testing file I set up, which does no real tests yet, but is good for browser code testing, and much easier than the convoluted Travis CI virtual machine excess.

QUnit Testing

I’ve added in a dictionary acceleration method to the LZW, and called it PON (Packed Object Notation), which is only really effective when used after a BWT as in the pack method. Also some Unicode compression was added, which users of local 64 character sets will like. This leaves a final point in the compression layer of UCS-2 to UTF-8 conversion at the XmlHttpRequest boundary. By default this uses a text interface, and so the 16 bit characters native to JavaScript strings, are UTF-8 encoded and decoded at the eventual net octet streaming. As the PON is expected to be large (when compression is really needed) compared to any other uncompressed JSON, there is an argument to serialize for high dictionary codes (doubling of uncompressed JSON size, and almost a third of PON size), or post UTF to apply SUTF coding (cutting one third off compressed PON without affecting uncompressed JSON). The disadvantage to this is on the server side. The PON is the same, but the uncompressed JSON part will need an encoding and decoding function pair, and hence consume computational and memory buffer resources on the server.

As the aim of this compression layer is to remove load off the server, this is something to think about. The PON itself will not need that encoding taking off or putting on. As much of the uncompressed JSON part will be for SQL where clauses, and as indexes for arrays, the server can be considered ignorant of SUTF, as long as all literals used in the PHP script are ASCII. This is likely for all JSON keys, and literal values. So the second option on the client side of SUTF of the UTF would be effective. Some would say put Gzip on the server, but that would be more server load, which is to be avoided for scaling. I wonder if SUTF was written to accept use of the UInt8Array type?

Some hindsight analysis shows the one third gain is not likely to be realized except with highly redundant data. More realistic data has a wider than 64 dictionary code spread, and the middle byte of a UTF-8 is the easiest one to drop on repeats. The first byte contains the length indication, and as such the code would become much more complex to drop the high page bits, by juggling the lower four bits (0 to 3) in byte two for the lower four bits of byte one. Possibly a self inverse function … Implemented (no testing yet). The exact nature of JSON input to the pack function is next on the list client side, with the corresponding server side requirements of maintaining a searchable store, and distribution replication consistency.

The code spread is now 1024 symbols (maintaining easy decode and ASCII preservation), as anything larger would affect bits four and five in byte two and change the one third saving on three byte UTF-8 code points. There are 2048 dictionary codes before this compression is even used, and so only applies for larger inputs. As the dictionary codes are slightly super linear, I did have an idea to normalize them by subtracting a linear growth overtime, and then “invert the negative bulges” where lower and hence shorter dictionary codes were abnormal to the code growth trend. This is not applicable though as not enough information is easily available in a compressed stream to recover the coding. Well at least it gives something for gzip to have a crack at for those who want to burden the server.

Putting riot.js and less.js on WordPress

You’ll need the insert headers and footers plugin and then put the following in the footer. As the mount is done before the page bottom, there maybe problems with some aspects. In any pages or postings it then becomes possible to incorporate your own tag objects, and refer to them in a type definition. As the service of such “types” has to be free of certain HTML concerns, it’s likely a good idea to set up a github repo to store your types. CSS via less is in the compiler.

<script src="https://rawgit.com/jackokring/riot-embed/master/less.min.js"></script>
<script src="https://rawgit.com/jackokring/riot-embed/master/riot%2Bcompiler.min.js"></script>
<script src="https://rawgit.com/jackokring/riot-embed/master/underscore.min.js"></script>
<script>
  riot.mount('*');
</script>

Below should appear a timer … The visual editor can make your custom tags disappear.

Produced by the following …

<timer></timer>
<script src="hhttps://rawgit.com/jackokring/riot-embed/master/timer.tag" type="riot/tag"></script>

Or with the shortcoder plugin this becomes … in any post …

[sc name="riot" tag="timer"]

And … once in the shortcoder editor

<%%tag%%></%%tag%%>
<script src="https://rawgit.com/jackokring/riot-embed/master/%%tag%%.tag" type="riot/tag"></script>

This is then the decided dynamic content system, and the back end, and script services are being developed on the following GitHub repository. The project scope is describe there. Client side where possible, and server side for database replication consistency. The tunnel to the database will be the only PHP, and all page returns for ajax style stuff will be as JavaScript returns, so that no static json is sent.

GitHub Repo for Ideas

Next up storing PON …

Models, Views and Controllers

Had an interesting meetup about Android wear. All good, nice introduction to the new Complexity API, and general revision on optimization for power and other resources. It did get me thinking about MVC design patterns, and how I’d love the implementation to be. Model instantiate, model update, a model save callable, a view renderer, and a control list builder from method parameter signature. Very minimalist.

Initial thoughts indicate that the ordering of the controller events should be done based on the order of method occurrence in the class file for say Java. Also supplying a store state to the instantiate, and as a parameter to the store callback would be good. Like a Bundle. The renderer just needs a graphics context, and the controller callbacks need a unique method name (with some printed name mangling), and a boolean parameter to either do, or request doable for a grey out, and returning a boolean done, or doable status.

Synchronized on thread can keep mutual criticals from interfering, and a utility set of methods such as useIcon on the doable control query path could set up the right icon within a list with icons. This is the simplest MVC factorization I can envisage, and there are always more complex ones, although simpler ones would be hard. For post-process after loaded, or pre-process before saving, there would be no essential requirement to sub factor, although overrides of methods in a higher level object MVC may benefit from such.

Method name mangling should really be combined with internationalization, which makes for at least one existential method name printed form and language triad. This sort of thing should not be XML, it should be a database table. Even the alternate doable test path could be auto generated via extra fields in this table and some extra generation code, to keep the “code” within the “list field” of the editor. I’ve never seen a text editor backed by a database. Perhaps it could help in a “pivot table/form” way.

Just to say, I do like languages with a ‘ (quote) or “setq” or “Hold” method of supplying un-evaluated closures as function or method arguments. JavaScript has anonymous functions. Yep, okie dokie.

The Build

Java development is under way with some general extension of Kodek and an outline of some classes to make a generalized Map structure. The corporate bank account is almost opened, and things in general are slow but well. There are new product development ideas and some surprising service options not in computing directly.

The directors birthday went off without too many problems. Much beer was consumed, and the UK weather decided to come play summer. Lovely. If you have any paying contracts in computer software, do inquire. Guaranteed is in depth analysis, technical appraisal, scientific understanding, a choice between fast done and well structured code, and above all humour unless paid (in advance) otherwise not to let flow from the fountain of chuckle.