For ubiquity, it needs to be free and extremely simple to install. Apple gave HyperCard away for free, preinstalled on its machines. Ideally, you would persuade major manufacturers to do the same with your proposed replacement (good luck with that!).
Technologically, you need representations of 8 concepts:
Cards and backgrounds are layered, visually. The contents of a card always appear on top of the contents of its background.
There are also layers within each card and each background. The bottom layer contains graphics (in HyperCard that meant a single layer of black-and-white paint pixels), and the other visual objects are stacked on top of the graphics in the order they're added. You need some kind of picture-editing tools to make these graphics. HyperCard had a sort of mini-MacPaint built in for this purpose.
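To make the layering rule concrete, here is a rough Python sketch of the draw order described above. The `Layer` class and `paint_order` function are names I made up for illustration, and the graphics layer is just a placeholder string rather than a real bitmap.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """One card or background: a paint layer plus parts in creation order."""
    name: str
    graphics: str = ""                              # stand-in for a bitmap
    parts: list = field(default_factory=list)       # buttons/fields, oldest first


def paint_order(background: Layer, card: Layer) -> list:
    """Return the draw order, bottom first; later entries cover earlier ones."""
    order = []
    for layer in (background, card):                # the card is drawn after its background
        order.append(f"{layer.name}:graphics")      # graphics sit at the bottom of each layer
        order.extend(f"{layer.name}:{part}" for part in layer.parts)
    return order


bg = Layer("bg", parts=["title_field"])
cd = Layer("card", parts=["ok_button", "notes_field"])
print(paint_order(bg, cd))
```

A renderer would walk that list from first to last, so anything on the card ends up painted over anything on the background.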
Each stack represents a containment hierarchy: the stack contains all the other objects; backgrounds and cards contain fields, buttons, and scripts. The stack knows the container of each object. When an event occurs, a sequence of objects has the opportunity to handle it, starting with the object in which it occurs. For example, if you mouse down on a button, the button is offered the event first, then its card or background, then its stack. The containment hierarchy acts like nested lexical scopes. Scripts can always see the entities that are present in the object to which they're attached, plus all of the nested containers that contain it.
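The event-passing idea can be sketched in a few lines of Python. This is a simplification under my own assumptions: every object has exactly one `parent` container, handlers are plain callables keyed by event name, and the class name `Part` is hypothetical.

```python
class Part:
    """Any object in the hierarchy: stack, background, card, button, field."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # the container; None for the stack itself
        self.handlers = {}        # event name -> callable(target)

    def send(self, event):
        """Offer the event to this object, then to each enclosing container."""
        obj = self
        while obj is not None:
            handler = obj.handlers.get(event)
            if handler is not None:
                return handler(self)   # the handler is told where the event occurred
            obj = obj.parent
        return None                    # nobody in the chain handled it


stack = Part("stack")
card = Part("card 1", parent=stack)
button = Part("OK", parent=card)

stack.handlers["mouse_down"] = lambda target: f"stack handled click on {target.name}"
print(button.send("mouse_down"))   # only the stack has a handler so far

card.handlers["mouse_down"] = lambda target: f"card handled click on {target.name}"
print(button.send("mouse_down"))   # now the card intercepts it first
```

The same parent chain is what gives you the "nested scopes" behaviour: looking up a name works the same way as looking up a handler, walking outward until something matches.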
You need a file format that can store all of these things conveniently. HyperCard put all of them into a single file per stack, which was especially convenient. To the user, the file was the stack. Simple. That's what you want. An obvious choice would be to store a stack in a SQLite file, but be prepared for a lot of work on the schema to get things laid out and working right.
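As a starting point only, here is one way the single-file idea might look with Python's built-in `sqlite3` module. The one-table schema, the column names, and the helper functions are all my assumptions, not anything HyperCard did; a real design would need much more (properties, paint data, versioning).

```python
import sqlite3

# Hypothetical one-table layout: every object is a row pointing at its container.
SCHEMA = """
CREATE TABLE object (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES object(id),  -- NULL for the stack itself
    kind      TEXT NOT NULL,                  -- 'stack', 'background', 'card', 'button', 'field'
    name      TEXT,
    z_order   INTEGER NOT NULL DEFAULT 0,     -- stacking order within the container
    script    TEXT NOT NULL DEFAULT '',
    content   BLOB                            -- field text or an encoded paint layer
);
"""


def new_stack_file(path):
    """Create an empty stack file (use ':memory:' for experiments)."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def add_object(db, parent_id, kind, name, script=""):
    """Insert one object and return its id."""
    cur = db.execute(
        "INSERT INTO object (parent_id, kind, name, script) VALUES (?, ?, ?, ?)",
        (parent_id, kind, name, script),
    )
    db.commit()
    return cur.lastrowid


def container_chain(db, object_id):
    """Names from the object outward to the stack, using a recursive query."""
    rows = db.execute(
        """
        WITH RECURSIVE chain(id, parent_id, name, depth) AS (
            SELECT id, parent_id, name, 0 FROM object WHERE id = ?
            UNION ALL
            SELECT o.id, o.parent_id, o.name, chain.depth + 1
            FROM object o JOIN chain ON o.id = chain.parent_id
        )
        SELECT name FROM chain ORDER BY depth
        """,
        (object_id,),
    ).fetchall()
    return [row[0] for row in rows]


db = new_stack_file(":memory:")
stack_id = add_object(db, None, "stack", "Address Book")
card_id = add_object(db, stack_id, "card", "card 1")
button_id = add_object(db, card_id, "button", "OK")
print(container_chain(db, button_id))
```

Because everything lives in one SQLite database, the user still sees a single file per stack, and the `parent_id` column is how "the stack knows the container of each object".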
You need a scripting language in which all of those objects mentioned above are first-class named classes or prototypes that can be instantiated. HyperCard used a purpose-built language called HyperTalk, which was designed to approximate colloquial English for the sake of approachability. You probably don't need to do that. You could probably use something like Lua or Python. You would need to build a comprehensive library to represent and operate on all the standard objects.
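If you went the Python route, attaching a script to an object could look something like the sketch below. The `Button` class, the `on_<event>` naming convention, and the `me` parameter are all invented for illustration. Note that `exec` runs arbitrary code, so a real system that opens stacks from strangers would need sandboxing that this sketch does not attempt.

```python
class Button:
    """A scriptable object; `script` holds Python source that defines handlers."""

    def __init__(self, name, script=""):
        self.name = name
        self.script = script
        self.clicks = 0

    def send(self, event):
        """Run the handler named on_<event> from the attached script, if any."""
        namespace = {}
        exec(self.script, namespace)          # evaluate the script; UNSAFE for untrusted stacks
        handler = namespace.get(f"on_{event}")
        if handler is None:
            return None                       # the script doesn't handle this event
        return handler(self)                  # `me` is the object the script is attached to


script = '''
def on_mouse_up(me):
    me.clicks += 1
    return f"{me.name} clicked {me.clicks} time(s)"
'''

ok = Button("OK", script)
print(ok.send("mouse_up"))
print(ok.send("mouse_up"))
```

The script is stored as plain text on the object, which is what lets it travel inside the stack file along with everything else.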
You need a top-level program that runs the stacks. Stacks should be designed to be self-contained, so that the top-level program doesn't need anything else to run a randomly-chosen stack. Stacks can refer to each other, but if you make a stack that does that and try to use it in a context where you've forgotten to include the other stack, that's your lookout.
Good luck!