Monday, May 10, 2010

Too much to learn for Programming? - Execution environment

There are many popular programming languages, and learning even a subset of them from scratch is a nightmare. The fact that each language needs some tuning on each platform makes it even more complex. Therefore, programming concepts should be learnt first at an abstract level and then adapted to a specific platform when needed. A programmer should have a fair grasp of the following areas to master any programming language and to transfer that knowledge to other languages:
  1. Execution environment
  2. Memory Architecture and Allocation/Deallocation
  3. Object Model
  4. Algorithms, Complexity and Data Structure
  5. Language Syntax
  6. Compilation & Linking
  7. Packaging and Deployment
Over the next few days, I will briefly explain each of these areas, starting today with the first one: execution environment.

EXECUTION ENVIRONMENT

An execution environment (EE) is a layer over the hardware on top of which your programs run. It is very often the operating system, but it could also be another platform such as a virtual machine (JVM or CLR), a browser (e.g. for JavaScript) or another application (e.g. VBA in Office applications). Your program must have one or more target platforms. Software targeting one platform may not run on a different platform.

An EE imposes various constraints, e.g. OS version and 32/64-bit compatibility, or the requirement for certain run-time and supporting libraries. Some software is OS agnostic but depends on another execution environment layered over the OS. Java and .NET programs are the best-known examples of such runtime environments. It is important to understand the inner workings of an execution environment if you want to be an expert programmer: how the EE recognizes your file, how it loads it, and what information your executable file must provide so that the EE is able to load it.
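To make these constraints concrete, here is a minimal sketch in Python that asks the running environment to describe itself. The function name describe_environment and the dictionary keys are my own choices for illustration; the calls themselves come from the standard platform and struct modules.

```python
import platform
import struct


def describe_environment():
    """Collect a few facts about the environment this script is running in."""
    return {
        "os": platform.system(),                    # e.g. 'Windows', 'Linux', 'Darwin'
        "os_release": platform.release(),
        "machine": platform.machine(),              # hardware architecture reported by the OS
        "pointer_bits": struct.calcsize("P") * 8,   # pointer width of the running interpreter
        "runtime": platform.python_implementation(),
        "runtime_version": platform.python_version(),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Note that pointer_bits describes the interpreter process, not the OS: a 32-bit interpreter on 64-bit Windows reports 32, which is exactly the kind of mismatch an EE constraint is about.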

An example will make this clear. Windows considers a file executable if it has one of a few specific extensions (e.g. .exe, .bat, .com). But when the OS starts to load the file, it checks whether the file has certain information at predefined locations in order to set up the correct execution environment (DOS, Win32/64, WOW64). An .exe file is mostly in the Portable Executable (PE) format, which is derived from the Common Object File Format (COFF). Let's briefly examine the steps that Windows goes through to load an executable file:

  1. Validate parameters and flags that the shell (or programmatic CreateProcess call) specifies.
  2. Open image file (.exe).
  3. Create a process.
  4. Create initial thread.
  5. Perform Windows subsystem-specific initialization.
  6. Start execution of the initial thread.
  7. In the context of the new process, finish process initialization, including loading the required DLLs.
  8. Eventually the entry point function (main/wmain/_tmain for console applications and WinMain for GUI applications) is executed.
So this is a fair amount of work for the OS to load and run an executable file.
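The "information at predefined locations" mentioned earlier can be illustrated with a small Python sketch that reads the first few header fields of a PE image. This is only an approximation of the checks a loader performs, not the Windows loader's actual logic. The names read_pe_summary and make_fake_image are hypothetical helpers; the offsets and magic values follow the published PE/COFF layout.

```python
import struct

# Machine values from the PE/COFF format; only a few common ones are listed.
MACHINE_NAMES = {0x014C: "x86", 0x8664: "x64", 0xAA64: "ARM64"}


def read_pe_summary(data: bytes) -> dict:
    """Parse the minimum needed to tell which kind of PE image this is."""
    # The legacy DOS header starts with the two bytes 'MZ'.
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # Offset 0x3C holds e_lfanew, the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("MZ file without a PE signature")
    # The COFF file header follows the signature; Machine is its first field.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    # The optional header follows the 20-byte COFF header;
    # its magic number distinguishes PE32 from PE32+.
    (magic,) = struct.unpack_from("<H", data, pe_offset + 24)
    return {
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "format": {0x10B: "PE32", 0x20B: "PE32+"}.get(magic, hex(magic)),
    }


def make_fake_image(machine: int, magic: int) -> bytes:
    """Build a tiny synthetic header for demonstration; not a loadable executable."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # PE signature placed right after the DOS header
    coff = struct.pack("<HHIIIHH", machine, 0, 0, 0, 0, 0, 0)  # 20-byte COFF header
    return bytes(dos) + b"PE\0\0" + coff + struct.pack("<H", magic)


if __name__ == "__main__":
    print(read_pe_summary(make_fake_image(0x8664, 0x20B)))
```

On a real system, the same function can be fed the first few kilobytes of any .exe or .dll read in binary mode.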

The process is quite different for so-called platform-independent programs. During compilation, a .NET-compatible compiler adds a reference to MSCorEE.dll, known as the Microsoft .NET Runtime Execution Engine, to the generated PE32 or PE32+ (for 64-bit) executable file. The steps for loading and running a .NET image are roughly as follows:
  1. Steps 1 to 6 of the native image load described above are performed.
  2. MSCorEE.dll is loaded into the process address space.
  3. The initial thread starts executing by calling a method in MSCorEE.dll, which initializes the CLR, loads the EXE assembly and then calls its entry point method.
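One way to see that a .NET executable is still a PE file with extra information is to look for the CLR runtime header, which the PE format places in data-directory slot 14 of the optional header. The Python sketch below checks for that entry. The helper names has_clr_header and make_fake_image are hypothetical, and the test image is synthetic rather than a real assembly.

```python
import struct

CLR_DIRECTORY_INDEX = 14  # data-directory slot reserved for the CLR runtime header


def has_clr_header(data: bytes) -> bool:
    """Return True if the PE optional header points at a CLR (managed) header."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        return False
    opt = pe_offset + 24  # optional header starts after the signature and COFF header
    (magic,) = struct.unpack_from("<H", data, opt)
    # Data directories begin 96 bytes into a PE32 optional header, 112 into PE32+.
    if magic == 0x10B:
        dirs = opt + 96
    elif magic == 0x20B:
        dirs = opt + 112
    else:
        return False
    (count,) = struct.unpack_from("<I", data, dirs - 4)  # NumberOfRvaAndSizes
    if count <= CLR_DIRECTORY_INDEX:
        return False
    rva, size = struct.unpack_from("<II", data, dirs + 8 * CLR_DIRECTORY_INDEX)
    return rva != 0 and size != 0


def make_fake_image(clr_rva: int, clr_size: int, magic: int = 0x10B) -> bytes:
    """Build synthetic headers for demonstration; not a loadable executable."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)
    fixed = 96 if magic == 0x10B else 112
    opt = bytearray(fixed + 16 * 8)  # fixed part plus 16 directory entries
    struct.pack_into("<H", opt, 0, magic)
    struct.pack_into("<I", opt, fixed - 4, 16)
    struct.pack_into("<II", opt, fixed + 8 * CLR_DIRECTORY_INDEX, clr_rva, clr_size)
    return bytes(dos) + b"PE\0\0" + bytes(20) + bytes(opt)
```

A native image leaves that slot zeroed, so the function returns False for it; a managed image fills it in, which is what lets the runtime find its metadata once MSCorEE.dll takes over.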
The process is more straightforward for scripting code. The host application is responsible for providing the memory and object model to the script code. Script programs are also more restricted in what they can do than native and VM programs. For JavaScript code, the browser has a JavaScript engine which parses, loads, and executes the code.
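The host/script relationship can be sketched in a few lines of Python, with the host deciding which objects the script may see. The Document class and run_script function are made up for illustration and do not correspond to any real browser or Office API. Limiting the names passed to exec only demonstrates the idea of a host-provided object model; it is not a security sandbox.

```python
class Document:
    """Hypothetical object model that the host exposes to scripts."""

    def __init__(self):
        self.title = "untitled"
        self.lines = []

    def write(self, text):
        self.lines.append(str(text))


def run_script(source: str, document: Document) -> None:
    """Execute script text with only the names the host chooses to provide."""
    namespace = {
        "__builtins__": {"len": len, "range": range},  # deliberately tiny set of built-ins
        "document": document,                           # the host's object model
    }
    exec(source, namespace)


script = """
document.title = "report"
for i in range(3):
    document.write("row " + "x" * (i + 1))
"""

doc = Document()
run_script(script, doc)
print(doc.title, doc.lines)
```

The script never allocates or owns the document; it only manipulates what the host hands to it, and names the host did not provide (such as open) are simply unavailable.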
