Protected Modes

Ah, the classic: is data execution prevented? It stands to reason that return-oriented programming wants data treated as code, by placing it on the stack where a return will pick it up as a code address. The obvious solution is to split the stack into two stacks, one for data and one for code. Every register in the processor would be tagged as containing data or code, with a protected “make code” instruction and enough instructions that force an automatic demotion to data. A store would then land on the right stack, because the stored register carries a tag saying which kind of target it is.

This goes beyond regular page-based virtual memory protection, and together with the supervisor bit it makes for 4 stack pointers (code and data, each in user and supervisor mode). This might be suitable for incorporation into a RISC-V processor, but that depends on the specifics of which register is the stack pointer (convention dictates sp is x2, but this can’t be relied on in hand-written hacker assembly) and on precisely how setting the alternate stack pointer is achieved.

In this way, it would be impossible to return to a data stack item, and hence to perform code address arithmetic on stack items. Following RISC-V practice for extending RV64G, the custom-0 opcode space is assumed until any ratification process solidifies the instruction encoding, and SXi is the extension name.

It might be possible to fit the whole functional extension into the M extension space, using the funct3 combinations that the RV64M word instructions leave unused: 001, 010 and 011 (MULW, DIVW, DIVUW, REMW and REMUW take the other five). These are all R-type operations with two inputs and a single target register. Code pointers are demoted to data pointers automatically on arithmetic and on some other instructions (to be determined); this needs documenting, but no extra demotion instructions are necessary. They all set an internal state such that mcause is set to illegal instruction (2) when the processor is not in a safe supervisor mode and it is indeterminate which of the code or data pointer registers is to be used.
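To make the encoding idea concrete, here is a rough sketch of where the three operations would sit. The R-type field layout, the OP-32 opcode and the M extension funct7 are standard RISC-V; reusing them for the new operations is only the tentative idea from the paragraph above (the custom-0 space remains the default assumption), and the names `encode_rtype` and `free_funct3` are made up for illustration.

```python
# Sketch only: standard R-type packing, applied to the tentative M-space placement.
OP_32 = 0b0111011          # opcode of the RV64 word-sized register operations
FUNCT7_MULDIV = 0b0000001  # funct7 used by the M extension

# funct3 values RV64M already uses under OP_32 / FUNCT7_MULDIV
RV64M_WORD_FUNCT3 = {
    0b000: "MULW",
    0b100: "DIVW",
    0b101: "DIVUW",
    0b110: "REMW",
    0b111: "REMUW",
}

def encode_rtype(opcode, rd, funct3, rs1, rs2, funct7):
    """Pack the six R-type fields into one 32-bit instruction word."""
    fields = (("opcode", opcode, 7), ("rd", rd, 5), ("funct3", funct3, 3),
              ("rs1", rs1, 5), ("rs2", rs2, 5), ("funct7", funct7, 7))
    for name, value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    return ((funct7 << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | opcode)

# The funct3 slots left over for the proposed operations.
free_funct3 = sorted(set(range(8)) - set(RV64M_WORD_FUNCT3))

# Hypothetical encoding of the 010 operation: rd = x6, rs1 = x5, rs2 slot = 4.
word_010 = encode_rtype(OP_32, 6, 0b010, 5, 4, FUNCT7_MULDIV)
```

Computing `free_funct3` this way gives the same three values listed above, which is a quick sanity check on the claim that the slots are actually free.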

So each register is marked by a bit as containing code or data, and each integer register is duplicated for when it is used as a load or store address. Which of the pair (regular or code shadow) supplies the address is chosen by the classification bit of the datum register, that is the source value register of a store or the target value register of a load (the datum itself is always the regular register).
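A minimal model of that selection rule might look like the following. This is a sketch under assumptions, not a hardware description: the class `TaggedRegs`, its field names and the method `address_base` are all hypothetical, and x0 reading as zero from either copy is taken from the note further down.

```python
from dataclasses import dataclass, field

@dataclass
class TaggedRegs:
    """Hypothetical register file: a regular bank, a code shadow bank, one tag bit each."""
    regular: list = field(default_factory=lambda: [0] * 32)
    shadow: list = field(default_factory=lambda: [0] * 32)      # code shadow copies
    is_code: list = field(default_factory=lambda: [False] * 32)  # classification bits

    def address_base(self, base, datum):
        """Return the address held by register `base` for a load or store.

        The copy used (regular or code shadow) is picked by the tag of the
        `datum` register, not by the tag of `base` itself.
        """
        if base == 0:
            return 0  # x0 reads as zero whichever copy is selected
        bank = self.shadow if self.is_code[datum] else self.regular
        return bank[base]

# Usage: x2 (sp) holds two different stack addresses.
regs = TaggedRegs()
regs.regular[2] = 0x1000   # data stack
regs.shadow[2] = 0x8000    # code stack
regs.is_code[1] = True     # x1 (ra) currently holds a code pointer

addr_for_ra = regs.address_base(2, 1)   # storing ra through sp
addr_for_t0 = regs.address_base(2, 5)   # storing t0 (x5) through sp
```

The point of the design is visible in the usage lines: the same `sp` register sends a return address and an ordinary value to two different stacks without the program naming a second stack pointer.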

001: This works opposite to 010, moving a code pointer to a data pointer with an offset (in the rs2 slot). Obviously it is not protected, as the result becomes data.

010: The instruction works like an add where rs1 is the data register and rd is the code register, with the rs2 slot holding a sign-extended two’s complement immediate offset to add. It is protected by a supervisor exception, as certain code targets are what need protecting.

011: This is a convenience instruction to move a code pointer to a code pointer, with an offset in the rs2 slot. (The code pointer is the shadowed copy of each general register, used as the address for all loads and stores while the stored register is flagged as code and not data.) It is protected by a supervisor exception.
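The three operations above can be sketched as a tiny behavioural model. Several things here are assumptions rather than settled design: the class and method names are invented, the privilege check is reduced to a single `privileged` flag, the trap is modelled as a Python exception carrying mcause 2, writes to x0 are discarded as with any RISC-V register write, and tag-bit updates are left out.

```python
MASK64 = (1 << 64) - 1
ILLEGAL_INSTRUCTION = 2  # mcause value for an illegal instruction

class IllegalInstruction(Exception):
    """Stand-in for the trap taken by a protected operation."""
    def __init__(self):
        super().__init__("illegal instruction")
        self.mcause = ILLEGAL_INSTRUCTION

def sext5(imm5):
    """Sign-extend the 5-bit rs2 slot: 0..15 stay as-is, 16..31 become -16..-1."""
    return (imm5 ^ 0x10) - 0x10

class Core:
    """Hypothetical model: a data (regular) bank and a code (shadow) bank."""
    def __init__(self):
        self.data = [0] * 32
        self.code = [0] * 32
        self.privileged = False  # assumption: one flag for "safe supervisor mode"

    def _require_privilege(self):
        if not self.privileged:
            raise IllegalInstruction()

    @staticmethod
    def _write(bank, rd, value):
        if rd != 0:  # x0 stays zero
            bank[rd] = value & MASK64

    def code_to_data(self, rd, rs1, imm5):
        """funct3 001: unprotected, the result is plain data."""
        self._write(self.data, rd, self.code[rs1] + sext5(imm5))

    def data_to_code(self, rd, rs1, imm5):
        """funct3 010: protected, creates a code pointer from data."""
        self._require_privilege()
        self._write(self.code, rd, self.data[rs1] + sext5(imm5))

    def code_to_code(self, rd, rs1, imm5):
        """funct3 011: protected, adjusts an existing code pointer."""
        self._require_privilege()
        self._write(self.code, rd, self.code[rs1] + sext5(imm5))
```

With only five bits in the rs2 slot the offset covers -16 to +15, so larger adjustments would have to be done on the data side before a protected 010 promotes the result.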

Yep, the 32 classification bits need to be stored somewhere, so an extra status register is needed. It has to be protected of course, and should be called mdata.
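As a sketch of how mdata could hold those bits, treat it as a 32-bit mask with one bit per integer register. The bit polarity (1 means code, 0 means data) and the helper names are assumptions made for illustration; nothing above fixes them.

```python
MDATA_MASK = 0xFFFFFFFF  # one classification bit per integer register x0..x31

def _check(reg):
    if not 0 <= reg < 32:
        raise ValueError("register index must be 0..31")

def is_code(mdata, reg):
    """True when register `reg` is classified as code (assumed polarity: 1 = code)."""
    _check(reg)
    return bool((mdata >> reg) & 1)

def mark_code(mdata, reg):
    """Return mdata with `reg` classified as code."""
    _check(reg)
    return (mdata | (1 << reg)) & MDATA_MASK

def demote_to_data(mdata, reg):
    """Return mdata with `reg` classified as data, as after an arithmetic result."""
    _check(reg)
    return mdata & ~(1 << reg) & MDATA_MASK

# Usage: x1 becomes a code pointer, then ordinary arithmetic demotes it again.
mdata = 0
mdata = mark_code(mdata, 1)
after_promotion = mdata
mdata = demote_to_data(mdata, 1)
```

Keeping all 32 bits in one register means a context switch only has to save and restore a single extra word alongside the shadow registers.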

Memory protection through the page table then isolates the differing actions of code and data pointers used for stacks. x0 still returns zero when used as an address (from the shadow value). JAL and JALR then also use the stacking information, to target a PC as code and to select the correct stack. It could be considered a supervisor exception state if the link register is data.
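One possible reading of that last rule, sketched for JALR only: the jump refuses to use a register that is tagged as data. The exception class `LinkIsData` and the function `jalr_target` are hypothetical, and the exact cause code is left open because it has not been decided above. Clearing the low bit of the target is standard JALR behaviour.

```python
MASK64 = (1 << 64) - 1

class LinkIsData(Exception):
    """Hypothetical trap: a jump was attempted through a register tagged as data."""

def jalr_target(code_bank, is_code, rs1, offset):
    """Compute a JALR target, trapping when rs1 is not classified as code.

    code_bank: the 32 code shadow values
    is_code:   the 32 classification bits
    """
    if not is_code[rs1]:
        raise LinkIsData(f"x{rs1} is tagged as data")
    # Standard JALR: add the offset, then clear the least significant bit.
    return (code_bank[rs1] + offset) & MASK64 & ~1

# Usage: a return through x1 succeeds only while x1 is still tagged as code.
code_bank = [0] * 32
is_code = [False] * 32
code_bank[1] = 0x80000005
is_code[1] = True
target = jalr_target(code_bank, is_code, 1, 0)
```

Under this reading, an overwritten return slot on the data stack never reaches the PC, because anything loaded from it arrives tagged as data.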

Looking for open security analysis.