9.15.2011

A bytecode thread system

A lot of things have changed, me and my attitude included, over the last days, but I still had my ideas flowing, so my ability to design stuff didn't change (which is definitely a good thing). I was always annoyed by how little you know when using threads within operating systems: How does it handle switching? How can you guarantee a certain time or percentage of computation power used? Can you rely on multi-processor usage or are you just getting another interrupt-controlled multiplexing? Well, I won't be able to change this, but I got a great idea for implementing a completely OS-independent way of enabling my bytecode interpreter to support multithreading.

All in all you just describe a set of "process blocks" which can themselves consist of other process blocks (if a block has no child blocks, it is an executed function). The whole set, including all processes and child processes, represents a time slice divided into quantized durations of equal size, marking the moments after which a task switch can happen. Each process can either get a fixed number of durations, a percentage of the whole slice, or a flexible number of durations (taking whatever durations are left) until switching to the next task in the list. Because process blocks can be nested, a percentage always scales to the size of the containing process block. All processes in the last mode, with a flexible amount of durations, share the remaining time after subtracting all fixed and percentage durations. This gives full control over the time slice, but it is insufficient for newly arriving threads and flexible task starting in general. Thus, I gave the fixed and percentage values a minimum and a maximum: this way the scheduler first assigns all minimum times and then expands them according to a set preference order (fixed first, percentage second, flex last) and the order in which all processes of the same kind are sorted in their corresponding list.

I especially like this concept because of its sensitive time-sharing nature: Imagine a game running with a renderer, a physics process and input, as well as a flexible number of processes used for path finding and more complex AI requiring quite an amount of time. A renderer might finish quite fast or take longer, resulting in fewer calculated frames per second if its maximum time was spent. The same goes for physics, which additionally requires a bit of synchronization with the input. All in all this results in bigger and smaller time gaps you can fill with a flexible number of long AI operations: either you want them done as fast as possible, or you put them in a fixed duration window and give each one a certain delay after key operations to mimic longer in-game AI time without interfering with the actual calculation time, simulating "thinkage delays". The idea is simple: use the same setup for a plethora of scripting processes and time-demanding operations together in the same interface for fine-grain performance control. Depending on the technology used, there's surely also room for making it adjustable by the user, tweaking it so that calls to a slower graphics card (which disable process interrupting) can take more time if necessary, while CPU operations for physics take less due to faster execution of that task. Sure, this might seem rather useless since everything would still take its time to finish, but if done with very short and small calls to functions outside the bytecode interpreter, it's very responsive and still able to guarantee quite fine-grain control.
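To make this a bit more concrete, here is a rough C++ sketch of how such a process-block tree and the slice assignment could look; all the names (ProcessBlock, Mode, assign_slice and so on) are just placeholders I made up for illustration, nothing here is actual interpreter code.

    #include <vector>
    #include <memory>
    #include <functional>
    #include <algorithm>

    enum class Mode { Fixed, Percent, Flex };

    struct ProcessBlock {
        Mode mode = Mode::Flex;
        // Fixed blocks count durations directly; Percent blocks take a fraction
        // of the parent's budget; Flex blocks share whatever is left over.
        // Assumes max >= min in both cases.
        int    min_durations = 0;
        int    max_durations = 0;
        double min_percent = 0.0;
        double max_percent = 0.0;

        std::function<void()> task;                          // leaf: the executed function
        std::vector<std::unique_ptr<ProcessBlock>> children; // inner node: sub-blocks

        int assigned = 0;                                    // durations granted this slice
    };

    // Distribute `budget` durations over one block's children:
    // 1) give every child its minimum, 2) expand fixed blocks, then percentage
    // blocks, in list order, 3) split the remainder evenly among flex blocks,
    // then recurse so nested percentages scale to their containing block.
    void assign_slice(ProcessBlock& block, int budget) {
        int left = budget;
        for (auto& c : block.children) {                     // step 1: minimums
            c->assigned = (c->mode == Mode::Fixed)   ? c->min_durations
                        : (c->mode == Mode::Percent) ? int(c->min_percent * budget)
                        : 0;
            left -= c->assigned;
        }
        for (auto& c : block.children) {                     // step 2a: expand fixed
            if (c->mode != Mode::Fixed || left <= 0) continue;
            int grow = std::min(left, c->max_durations - c->assigned);
            c->assigned += grow;
            left -= grow;
        }
        for (auto& c : block.children) {                     // step 2b: expand percent
            if (c->mode != Mode::Percent || left <= 0) continue;
            int grow = std::min(left, int(c->max_percent * budget) - c->assigned);
            c->assigned += grow;
            left -= grow;
        }
        int flex = 0;
        for (auto& c : block.children) flex += (c->mode == Mode::Flex);
        for (auto& c : block.children)                       // step 3: flex share
            if (c->mode == Mode::Flex && flex > 0) c->assigned = left / flex;
        for (auto& c : block.children)                       // recurse into sub-blocks
            if (!c->children.empty()) assign_slice(*c, c->assigned);
    }

In the game scenario above, the renderer and physics would be percentage blocks with a min/max each, input a small fixed block, and every AI job a flex child that simply soaks up whatever the frame leaves over. You know what?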
I guess it's time to seriously think about using it as my language for programming my game renderer, again. I think I can split the whole thing up so that multiple interpreters can run alongside each other and communicate, giving well-enough separated processes able to perform realtime tasks usually only achievable with C code. The needed bytecode is primitive to interpret with a new calling method I found for it: it only scans through the whole code once after loading, relocating opcodes to actual function calls to avoid iterating over opcode numbers. This makes the set of built-in functions endless, and new ones can be generated at runtime if needed. Yep, it's time to start this one and make awesome things happen. This surely won't be any less fast than interface-filled C++ code (ok, not THAT fast, but definitely faster than purely bytecode-interpreted Java).
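Just to pin down what I mean by relocating opcodes, something along these lines; the handler table, the fixed two-argument instruction format and all names are made up for the sketch, not the real thing. After the one-time pass the interpreter never looks at opcode numbers again, it just calls through function pointers, so the set of built-in functions can grow at runtime by registering new handlers.

    #include <cstdint>
    #include <vector>
    #include <unordered_map>

    struct VM;                                    // interpreter state (registers, stack, ...)
    using Handler = void (*)(VM&, const std::uint64_t*);

    struct Instr {
        Handler       fn;                         // resolved at load time
        std::uint64_t args[2];
    };

    // One pass over the raw bytecode: each opcode number is looked up exactly
    // once and replaced by a direct function pointer plus its arguments.
    std::vector<Instr> relocate(const std::vector<std::uint64_t>& code,
                                const std::unordered_map<std::uint64_t, Handler>& table) {
        std::vector<Instr> out;
        for (std::size_t i = 0; i + 2 < code.size(); i += 3) {
            Instr ins;
            ins.fn      = table.at(code[i]);      // opcode -> handler, resolved once
            ins.args[0] = code[i + 1];
            ins.args[1] = code[i + 2];
            out.push_back(ins);
        }
        return out;
    }

    void run(VM& vm, const std::vector<Instr>& program) {
        for (const Instr& ins : program)
            ins.fn(vm, ins.args);                 // direct call, no switch over opcodes
    }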

Hooray! The armies of hell, all combined in a single, nice package for my utmost convenience.
