A Modified ElGamal for Passwords Only

It occurred to me g does not need to be made public for ElGamal signing, if the value g^H(m) is stored as the password hash, generated by the client. Also (r, s) can be changed to (r, r^s) to reduce server verification load to one mod power and one precision multiply mod p, and a subtraction equality test. So on the creation of a new password (y, p, g^H(m)) is created, and each log in needs the client to generate a k value to make (r, r^s).

Password recovery would be a little complex, and involve some email backdoor based on maybe using x as a pseudo H(m), and verifying the changes via generation of y. This would of course only set the local browser to have a new password. So maybe a unique (y, p, g^H(m)) per browser local store used. Index the local storage via email address, and Bob’s yer been here before.

Also, the server can crypt any pending view using H(m) as a person’s private key, or the private key as a browser specific personal private key, or maybe even browser key with all clients using same local store x value. All using DH shared secrets. This keeps data in a database a bit more private, and sometimes encrypt to self might be useful.

Is s=H(m)(1-r)(k^-1) mod (p-1) an option? As this sets H(m)=x, eliminating another y, making (p, g^H(m)) sufficient for authentication server storage, and g is only needed if the server needs to send crypts. Along with r=g^k mod p, as some easy sign. (r, s) might have to be used, as r^s could be equated as modinverse(r) for an easy g^H(m) equality, and the requirement to calculate s from r^s is a challenge. So a secure version is not quite as server efficient.

In reality k also has to be computed to prevent (r, s) reuse. This requires the k choice is the servers. Sending k in plaintext defeats the security, so g is needed, to calculate g^z, and so g^(H(m))^z=k on both sides. A retry randomizer to hide s=0, and a protocol is possible.

This surpasses a server md5 of the password. If the md5 is client side, a server capture can log in. If the md5 is server side, the transit intercept is … but a server DB compromise also needs a web server compromise. This algorithm also needs a client side compromise, or email intercept as per.

The reuse of (r, s) can’t be prevented without knowing k, and hence H(m), therefore a shared secret as a returned value implies H(m) knowledge. So one mod power client side, and two server side.

g^k to client.
(g^k)^H(m) to server.
(g^H(m))^k = (g^k)^H(m) tests true.

Signatures are useless as challenge responses. The RSA version would have to involve a signature on H(m) and so need H(m) direct. Also, the function H can be quite interesting to study. The application of client side salt also is not needed on the server side as a decode key, and so not decoded there. DH is so cool like that. And (p-1) having a large factor is easy to arrange in the key generation. And write access is harder, most of the time, to obtain for data.

The storing of a crypt with the g^k used, locks it for H(m) keyed access. This could void data on a password reset, or a browser local storage reset, but does prevent some client’s data leak opertunities, such as DB decrypt keys. This would have multiple crypts of the symmetric key for shared data, but would this significantly reduce the shared key security? It would prevent new users accessing the said secured data without cracking the shared key. A locked share for private threads say?

Spamming your friends with g^salt and g^salt^H(m)?

The first one is a good idea, the second not so much. AI spam encoding g^salt to your and friends accounts. The critical thing is the friend doesn’t get the password. Assuming a bad friend, who registers and gets g^salt to activate, from their own chosen spoof password. An email does get sent to your email, to cancel the friend as an option, and no other problem exists excepting login to a primary mail account. As a spoof maybe would see the option to remove you from your own account.

The primary control email account would then need secondary authentication. Such as only see the spam folder, and know what to open first and in order. For password recovery, this would be ok. For initial registration, it would be first come first served anyhow.

The Cloud Project

So far I’m up to 5 classes left to fill in

  • SignedPublicKey
  • Server
  • Keys
  • AuditInputStream
  • ScriptOutputStream

They are closely coupled in the package. The main reason for defining a new SignedPublicKey class is that the current CA system doesn’t have sufficient flexibility for the project. The situation with tunnel proxies has yet to be decided. At present the reverse proxy tunnel over a firewall ia based on overiding DNS at the firewall, to route inwards and not having the self as the IP for the host address. Proxy rights will of course be certificate based, and client to client link layer specific.

UPDATE: Server has been completed, and now the focus is on SignedPublicKey for the load/save file access restrictions. The sign8ng process also has to be worked out to allow easy use. There is also some consideration for a second layer of encryption over proxy connection links, and some decisions to be made on the server script style.

The next idea would be a client specific protocol. So instead of server addresses, there would be a client based protocol addressing string. kring.co.uk/file is a server domain based address. This perhaps needs extending.

General Update

An update on the current progress of projects and general things here at KRT. I’ve set about checking out TypeScript for using in projects. It looks good, has some hidden pitfalls on findingĀ .m.ts files for underscore for example, but in general looks good. I’m running it over some JS to get more of a feel. The audio VST project is moving slowly, at oscillators at the moment, with filters being done. I am looking into cache coherence algorithms and strategies to ease hardware design at the moment too. The 68k2 document mentioned in previous post is expanding with some of these ideas in having a “stall on value match” register, with a “touch since changed” bit in each cache line.

All good.

The Processor Design Document in Progress

TypeScript

Well I eventually managed to get a file using _.reduce() to compile without errors now. I’ll test it as soon as I’ve adapted in QUnit 2.0.1 so I can write my tests to the build as a pop up window, an perhaps back load a file to then be able to save the file from within the editor, and hence to become parser frame.

Representation

An excerpt from the 68k2 document as it’s progressing. An idea on UTF8 easy indexing and expansion.

“Reducing the size of this indexing array can recursively use the same technique, as long as movement between length encodings is not traversed for long sequences. This would require adding in a 2 length (11 bit form) and a 3 length (16 bit form) of common punctuation and spacing. Surrogate pair just postpones the issue and moves cache occupation to 25%, and not quite that for speed efficiency. This is why the simplified Chinese is common circa 2017, and surrogate processing has been abandoned in the Unicode specification, and replaced by characters in the surrogate representation space. Hand drawing the surrogates was likely the issue, and character parts (as individual parts) with double strike was considered a better rendering option.

UTF8 therefore has a possible 17 bit rendering for due to the extra bit freed by not needing a UTF32 representation. Should this be glyph space, or skip code index space, or a mix? 16 bit purity says skip code space. With common length (2 bit) and count (14 bit), allowing skips of between 16 kB and 48 kB through a document. The 4th combination of length? Perhaps the representation of the common punctuation without character length alterations. For 512 specials in the 2 length form and 65536 specials in the 3 length forms. In UTF16 there would be issues of decode, and uniqueness. This perhaps is best tackled by some render form meta characters in the original Unicode space. There is no way around it, and with skips maybe UTF8 would be faster.”


// tool.js 1.1.1
// https://kring.co.uk
// (c) 2016-2017 Simon Jackson, K Ring Technologies Ltd
// MIT, like as he said. And underscored :D

import * as _ from 'underscore';

//==============================================================================
// LZW-compress a string
//==============================================================================
// The bounce parameter if true adds extra entries for faster dictionary growth.
// Usually LZW dictionary grows sub linear on input chars, and it is of note
// that after a BWT, the phrase contains a good MTF estimate and so maybe fine
// to append each of its chars to many dictionary entries. In this way the
// growth of entries becomes "almost" linear. The dictionary memory foot print
// becomes quadratic. Short to medium inputs become even smaller. Long input
// lengths may become slightly larger on not using dictionary entries integrated
// over input length, but will most likely be slightly smaller.

// DO NOT USE bounce (=false) IF NO BWT BEFORE.
// Under these conditions many unused dictionary entries will be wasted on long
// highly redundant inputs. It is a feature for pre BWT packed PONs.
//===============================================================================
function encodeLZW(data: string, bounce: boolean): string {
var dict = {};
data = encodeSUTF(data);
var out = [];
var currChar;
var phrase = data[0];
var codeL = 0;
var code = 256;
for (var i=1; i<data.length; i++) {
currChar=data[i];
if (dict['_' + phrase + currChar] != null) {
phrase += currChar;
}
else {
out.push(codeL = phrase.length > 1 ? dict['_'+phrase] : phrase.charCodeAt(0));
if(code < 65536) {//limit
dict['_' + phrase + currChar] = code;
code++;
if(bounce && codeL != code - 2) {//code -- and one before would be last symbol out
_.each(phrase.split(''), function (chr) {
if(code < 65536) {
while(dict['_' + phrase + chr]) phrase += chr;
dict['_' + phrase + chr] = code;
code++;
}
});
}
}
phrase=currChar;
}
}
out.push(phrase.length > 1 ? dict['_'+phrase] : phrase.charCodeAt(0));
for (var i=0; i<out.length; i++) {
out[i] = String.fromCharCode(out[i]);
}
return out.join();
}

function encodeSUTF(s: string): string {
s = encodeUTF(s);
var out = [];
var msb: number = 0;
var two: boolean = false;
var first: boolean = true;
_.each(s, function(val) {
var k = val.charCodeAt(0);
if(k > 127) {
if (first == true) {
first = false;
two = (k & 32) == 0;
if (k == msb) return;
msb = k;
} else {
if (two == true) two = false;
else first = true;
}
}
out.push(String.fromCharCode(k));
});
return out.join();
}

function encodeBounce(s: string): string {
return encodeLZW(s, true);
}

//=================================================
// Decompress an LZW-encoded string
//=================================================
function decodeLZW(s: string, bounce: boolean): string {
var dict = {};
var dictI = {};
var data = (s + '').split('');
var currChar = data[0];
var oldPhrase = currChar;
var out = [currChar];
var code = 256;
var phrase;
for (var i=1; i<data.length; i++) {
var currCode = data[i].charCodeAt(0);
if (currCode < 256) {
phrase = data[i];
}
else {
phrase = dict['_'+currCode] ? dict['_'+currCode] : (oldPhrase + currChar);
}
out.push(phrase);
currChar = phrase.charAt(0);
if(code < 65536) {
dict['_'+code] = oldPhrase + currChar;
dictI['_' + oldPhrase + currChar] = code;
code++;
if(bounce && !dict['_'+currCode]) {//the special lag
_.each(oldPhrase.split(''), function (chr) {
if(code < 65536) {
while(dictI['_' + oldPhrase + chr]) oldPhrase += chr;
dict['_' + code] = oldPhrase + chr;
dictI['_' + oldPhrase + chr] = code;
code++;
}
});
}
}
oldPhrase = phrase;
}
return decodeSUTF(out.join(''));
}

function decodeSUTF(s: string): string {
var out = [];
var msb: number = 0;
var make: number = 0;
var from: number = 0;
_.each(s, function(val, idx) {
var k = val.charCodeAt(0);
if (k > 127) {
if (idx < from + make) return;
if ((k & 128) != 0) {
msb = k;
make = (k & 64) == 0 ? 2 : 3;
from = idx + 1;
} else {
from = idx;
}
out.push(String.fromCharCode(msb));
for (var i = from; i < from + make; i++) {
out.push(s[i]);
}
return;
} else {
out.push(String.fromCharCode(k));
}
});
return decodeUTF(out.join());
}

function decodeBounce(s: string): string {
return decodeLZW(s, true);
}

//=================================================
// UTF mangling with ArrayBuffer mappings
//=================================================
declare function escape(s: string): string;
declare function unescape(s: string): string;

function encodeUTF(s: string): string {
return unescape(encodeURIComponent(s));
}

function decodeUTF(s: string): string {
return decodeURIComponent(escape(s));
}

function toBuffer(str: string): ArrayBuffer {
var arr = encodeSUTF(str);
var buf = new ArrayBuffer(arr.length);
var bufView = new Uint8Array(buf);
for (var i = 0, arrLen = arr.length; i < arrLen; i++) {
bufView[i] = arr[i].charCodeAt(0);
}
return buf;
}

function fromBuffer(buf: ArrayBuffer): string {
var out: string = '';
var bufView = new Uint8Array(buf);
for (var i = 0, arrLen = bufView.length; i < arrLen; i++) {
out += String.fromCharCode(bufView[i]);
}
return decodeSUTF(out);
}

//===============================================
//A Burrows Wheeler Transform of strings
//===============================================
function encodeBWT(data: string): any {
var size = data.length;
var buff = data + data;
var idx = _.range(size).sort(function(x, y){
for (var i = 0; i < size; i++) {
var r = buff[x + i].charCodeAt(0) - buff[y + i].charCodeAt(0);
if (r !== 0) return r;
}
return 0;
});

var top: number;
var work = _.reduce(_.range(size), function(memo, k: number) {
var p = idx[k];
if (p === 0) top = k;
memo.push(buff[p + size - 1]);
return memo;
}, []).join('');

return { top: top, data: work };
}

function decodeBWT(top: number, data: string): string { //JSON

var size = data.length;
var idx = _.range(size).sort(function(x, y){
var c = data[x].charCodeAt(0) - data[y].charCodeAt(0);
if (c === 0) return x - y;
return c;
});

var p = idx[top];
return _.reduce(_.range(size), function(memo){
memo.push(data[p]);
p = idx[p];
return memo;
}, []).join('');
}

//==================================================
// Two functions to do a dictionary effectiveness
// split of what to compress. This has the effect
// of applying an effective dictionary size bigger
// than would otherwise be.
//==================================================
function tally(data: string): number[] {
return _.reduce(data.split(''), function (memo: number[], charAt: string): number[] {
memo[charAt.charCodeAt(0)]++;//increase
return memo;
}, []);
}

function splice(data: string): string[] {
var acc = 0;
var counts = tally(data);
return _.reduce(counts, function(memo, count: number, key) {
memo.push(key + data.substring(acc, count + acc));
/* adds a seek char:
This assists in DB seek performance as it's the ordering char for the lzw block */
acc += count;
}, []);
}

//=====================================================
// A packer and unpacker with good efficiency
//=====================================================
// These are the ones to call, and the rest sre maybe
// useful, but can be considered as foundations for
// these functions. some block length management is
// built in.
function pack(data: any): any {
//limits
var str = JSON.stringify(data);
var chain = {};
if(str.length > 524288) {
chain = pack(str.substring(524288));
str = str.substring(0, 524288);
}
var bwt = encodeBWT(str);
var mix = splice(bwt.data);

mix = _.map(mix, encodeBounce);
return {
top: bwt.top,
/* tally: encode_tally(tally), */
mix: mix,
chn: chain
};
}

function unpack(got: any): any {
var top: number = got.top || 0;
/* var tally = got.tally; */
var mix: string[] = got.mix || [];

mix = _.map(mix, decodeBounce);
var mixr: string = _.reduce(mix, function(memo: string, lzw: string): string {
/* var key = lzw.charAt(0);//get seek char */
memo += lzw.substring(1, lzw.length);//concat
return memo;
}, '');
var chain = got.chn;
var res = decodeBWT(top, mixr);
if(_.has(chain, 'chn')) {
res += unpack(chain.chn);
}
return JSON.parse(res);
}

 

LZW (Perhaps with Dictionary Acceleration) Dictionaries in O(m) Memory

Referring to a previous hybrid BWT/LZW compression method I have devised, the dictionary of the LZW can be stored in chain linked fixed size structure arrays one character (the symbol end) back linking to the first character through a chain. This makes efficient symbol indexing based on number, and with the slight addition of two extra pointers, a set of B-trees can be built separated by symbol length to also be loaded in inside parallel arrays for fast incremental finding of the existence of a symbol. A 16 bucket move to front hash table could also be used instead of a B-tree, depending on the trade off between memory of a 2 pointer B-tree, or a 1 pointer MTF collision hash chain.

On the nature of the BWT size, and the efficiency. Using the same LZW dictionary across multiple BWT blocks with the same suffix start character is effective with a minor edge effect, rapidly reducing in percentage as the block size increases. An interleave reordering such that the suffix start character is the primary group by of linearity, assists in the scan for serachability. The fact that a search can be rephrased as a join on various character pairings, the minimal character pair can be scanned up first, and “joined” to the end of the searched for string, and then joined to the beginning in a reverse search, to then pull all the matches sequentially.

Finding the suffixes in the LZW structure is relatively easy to produce symbol codes, to find the associated set of prefixes and infixes is a little more complex. A mostly constant search string can be effectively compiled and searched. A suitable secondary index extension mapping symbol sequences to “atomic” character sequences can be constructed to assist in the transform of characters to symbol dictionary index code tuples. This is a second level table in effect, which can be also compressed for atom specific search optimization without the LZW dictionary loading without find.

The fact the BWT infers an all matches sequential nature, and a second level of BWT with the dictionary index codes as the alphabet could defiantly reduce the needed scan time for finding each LZW symbol index sequence. Perhaps a unified B-tree as well as the length specific B-tree within the LZW dictionary would be useful for greater and less than constraints.

As the index can become a self index, there maybe a need to represent a row number along side the entry. Multi column indexes, or primary index keys would then best be likely represented as pointer tuples, with some minor speed size data duplication in context.

An extends chain pointer and a first of extends is not required, as the next length B-tree will part index all extenders. A root pointer to the extenders and a secondary B-tree on each entry would speed finding all suffix or contained in possibilities. Of course it would be best to place these 3 extra pointers in a parallel structure so not to be data interleaved array of struct, but struct of array, when dynamic compilation of atomics is required.

The find performance will be slower than an uncompressed B-tree, but the compression is useful to save storage space. The fact that the memory is used more effectively when compression is used, can sometimes lead to improved find performance for short matches, with a high volume of matches. An inverted index can use the position index of the LZW symbol containing the preceding to reduce the size of the pointers, and the BWT locality effect can reduce the number of pointers. This is more standard, and combined with the above techniques for sub phases or super phrases should give excellent find performance. For full record recovery, the found LZW symbols only provide decoding in context, and the full BWT block has to be decoded. A special reserved LZW symbol could precede a back pointer to the beginning of the BWT block, and work as a header of the post placed char count table and BWT order count.

So finding a particular LZW symbol in a block, can be iterated over, but the difficulty in speed is when the and condition comes in on the same inverse index. The squared time performance can be reduced? Reducing the number and size of the pointers in some ways help, but it does not reduce the essential scan and match nature of the time squared process. Ordering the matching to the “find” with least number on the count makes the iteration smaller on average, as it will be the least found, and hence least joined. The limiting of the join set to LZW symbols seems like it will bloom many invalid matches to be filtered, and in essence simplistically it does. But the lowering of the domain size allows application of some more techniques.

The first fact is the LZW symbols are in a BWT block subgroup based on the following characters. Not that helpful but does allow a fast filter, and less pointers before a full inverse BWT has to be done. The second fact is that the letter pair frequency effectively replaces the count as the join order priority of the and. It is further based on the BWT block subgroup size and the LZW symbol character counts for calculation of a pre match density of a symbol, this can be effectively estimated via statistics, and does not need a fetch of the actual subgroup size. In collecting multiple “find” items correlations can also be made on the information content of each, and a correlated but rarer “find” may be possible to substitute, or add in. Any common or un correlated “find” items should be ignored. Order by does tend to ruin some optimizations.

A “find” item combination cache should be maintained based on frequency of use and execution time to rebuild result both used in the eviction strategy. This in a real sense is a truncated “and” index. Replacing order by by some other method of such as order float, such that guaranteed order is not preserved, but some semblance of polarity is run. This may also be very useful to reduce sort time, and prevent excessive activity and hence time spent when limit clauses are used. The float itself should perhaps be record linked, with an MTF kind of thing in the inverse index.

VESA NET

VESA NET? An idea for a BIOS extension. A protocol for total removal of the video card from the server. The VESA frame buffer becomes virtual, and routed out only UDP to the default broadcast address. A listener on the network presents MAC address (maybe translated), say 256 screens (8 by 8), so that any maybe routed and zoomed full screen. Along with an SSH ability on the “KVM” box, the default console can be seen of a whole server array. The main purpose being to remove the graphics card from servers. Allowing an SSH via MAC address in the BIOS would seem convenient, but does have massive security issues, being inbound traffic. The printing of a key finger print on the default console assists in the concept of possible login for inbound traffic, and install of the virtual keyboard, and perhaps mouse for more socket removal on the server. Not a full spec, just a concept. The configuration of the VESA receiver to proxy important or even public screens of interest as a web forward, by maybe even including a virtual floppy disk in manor similar to the PROC fs, would also fit in the space available in a modern BIOS flash.

JDeveloper and Intel Python

The JDeveloper environment looks good. Nice work Oracle, and some of the Borland classic JBuilder. This tool look more like how I’d use an IDE. I’ve been looking at other technologies for computer development, and a recent Intel offering (for personal use free) is the MKL backed Intel Python. It needs at least an SSE4.2 supporting chip, but does have all that is needed to run the development on Xeon Phi Knights Landing. 72 cores and 144 vector processing AVX-512 engines. Multi Tflops stuff. For the developer this is perhaps the easiest way to start HPC, as through Cython and eventually C, the best performance can be had. Maybe FPGAs will help, and tools are available for that too. I’ve seen some good demonstrations, and maybe some clients with complex or hard problems would need this.

All this parallel stuff got me thinking of Kahan sums, and simulation of incompressibles by having a high speed of sound in a compressible, and the doing a compulsory diffusion to damp oscillation, and a pressure impulse (Pa s) handling of inertial failure of containers. It might reduce the non-locality of certain simulations, and actually act to simulate pressure hammer effects.

I’ve also recently got back into the idea of using Free Pascal for some of my projects maybe. There is now good JNI support, and even JVM targeting. I maen it’s very possible to use C for this kind of thing, but the FPC IDE and Lazarus are quick to build, with incremental unit compilation and many other features which make it good competition for general coding. Some would think it old hat, but the ease of use is excellent with much type checking, and no insistence on everything being a class. Units are very modular that way. The support for quite a few Pascal flavours is also good.

A Server Side Java Jetty Persistent Socket API is in Development

I looked into various available solutions, but for full back end customization I have decided on a persistent socket layer of my own design. The Firebase FCM module supplies the URL push for pull connections (Android client side), and an excellent SA-IS library class under MIT licence is used to provide FilterStream compression (BWT with contextual LZW). The whole thing is Externalizable, by design, and so far looking better than any solution available for free. Today is to put in more thought on the scalability side of things, as this will be difficult to rectify later.

Finding out how to make a JavaEE/Jetty Servlet project extension in Android Studio was useful, and I’d suggest the Embedded Jetty to anyone, and the Servlet API is quite a tiny part of the full jetty download. It looks like the back end becomes a three Servlet site, and some background tasks built on the persistent streams. Maybe some extension later, but so far even customer details don’t need to be stored.

The top level JSONAsync object supports keepUpdated() and clone() with updateTo(JSONObject) for backgrounded two directional compressed sync with off air and IP number independent functionality. The clone() returns a new JSONObject so allowing edits before updateTo(). The main method of detecting problems is errors in decoding the compressed stream. The code detects this, and requests a flush to reinitialize the compression dictionary. This capture of IOException with Thread looping and yield(), provides for re-establishment of the connection.

The method updateTo() is rate regulated, and may not succeed in performing an immediate update. The local copy is updated, and any remote updates can be joined with further updateTo() calls. A default thread will attempt a 30 second synchronization tick, if there is not one in progress. The server also checks for making things available to the client every 30 seconds, but this will not trigger a reset.

The method keepUpdated() is automatically rate regulated. The refresh interval holds off starting new refreshes until the current refresh is completed or failed. Refreshing is attempted as fast as necessary, but if errors start occurring, the requests to the server are slowed down.

The method trimSync() removes non active channels in any context where a certainty of connectivity is known. This is to prevent memory leaks. The automatic launching of a ClientProcessor when a new client FCM idToken is received, looks nice, with restoration of the socket layer killing ones which are not unique. The control flow can be activated and code in the flow must be written such that no race condition exists, such as performing two wrights. A process boot lock until the first control flow activator provides for sufficient guard against this given otherwise sequential dependency of and on a set of JSONAsync objects.