Sunday, November 29, 2009

Structure of an Object Oriented Program


This post is part of my ongoing effort to familiarize the readers with the commonalities & variabilities of C++/Java/C# programming languages. A tentative outline of this series can be found at Contents.

In the last post of this series, we saw what the structure of a very simple program looks like in C++/Java/C#. The C++ program was written in a procedural manner, since at least the entry point function main must be global (not a member of any class). Java is a nearly pure OO language (apart from the non-object primitive data types such as int, long, and float), so every method must be part of a class. Many argue that C# is a pure OO language, since its primitive types are implicitly derived from System.ValueType which is, in turn, derived from System.Object. In any case, as in Java, all data members and methods in C# belong to some type. In today's post, we will see the various elements of an OO program, their commonalities and variabilities.

Namespace/Package: Sets context for an item

A namespace groups a set of related elements and guarantees that their names are unique within it. It draws a conceptual boundary around the items, which distinguishes them from elements with the same name in other namespaces. Using namespaces is not mandatory, but their absence can cause confusion even in moderate-size programs. If you are a library programmer, you should use namespaces; otherwise the users of your library can easily run into compiler errors caused by duplicate names. Namespaces are hierarchical in nature: a namespace can contain another namespace.

Let's take a code snippet as an example. The following C++ code will not compile, because the same function is defined twice in the same scope.

int doSum(int x, int y) { return x + y; }
int doSum(int x, int y) { return x + y; }


If you really need the same identifier name, you can distribute them in different namespaces.

namespace Aggregate
{
    int doSum(int x, int y);
}

namespace Separate
{
    int doSum(int x, int y);
}
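To make the idea a little more concrete, here is a minimal C++ sketch of how the two functions can coexist and how a nested namespace looks. The namespace names Aggregate and Separate come from the snippet above; the function bodies and the nested Detail namespace are my assumptions, added only for illustration.

```cpp
// Hypothetical sketch: only the names Aggregate, Separate and doSum come
// from the post; the bodies and the Detail namespace are assumptions.
namespace Aggregate
{
    // Adds the two values.
    int doSum(int x, int y) { return x + y; }

    // A namespace nested inside another namespace.
    namespace Detail
    {
        int doSum(int x, int y, int z) { return x + y + z; }
    }
}

namespace Separate
{
    // Same name and signature as Aggregate::doSum, but no clash.
    // This variant adds the absolute values (illustrative only).
    int doSum(int x, int y)
    {
        return (x < 0 ? -x : x) + (y < 0 ? -y : y);
    }
}
```

A caller picks the intended function with the scope operator, e.g. Aggregate::doSum(2, -3), Separate::doSum(2, -3) or Aggregate::Detail::doSum(1, 2, 3).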

Java implements the namespace concept a bit differently from C++/C#. Java uses packages to organize the classes of a program, and the package hierarchy also defines how the class files are laid out in the file system of the target machine. A package inherently defines a namespace. For a class in the package java.lang, the package name has the following meaning:
  1. The declaration package java.lang; must be the first statement in the source file (only comments may precede it)
  2. java.lang is a namespace
  3. java is a namespace
  4. There exists a directory named java (it may be only virtual, e.g. inside a .jar file)
  5. There is a subdirectory named lang inside java (java\lang)

Using Namespace/Package: To use the short names of a namespace (in C++/C#) or package (in Java), you first import it into the source file. In C++ this is done with the using namespace directive, in C# with the using directive, and in Java with the import keyword. In all these languages you can also use the fully qualified name without importing anything.
C++
Definition:
namespace HexEditorApp
{
    // access specifiers such as public: are only allowed inside a class, not in a namespace
    int count = 10;

    int doSum(int x, int y)
    {
        return x + y;
    }
}

Usage:
using namespace HexEditorApp;

class HexConverter
{
public:
    void Convert()
    {
        int countElement = HexEditorApp::count; // fully qualified name
        int sum = doSum(10, 20); // call directly, as the namespace is already imported
    }
};

Java
Definition:
package HexEditorApp; // it must be the first statement of the file

/* In Java (as well as C#), you can't declare a variable or method outside a
   User Defined Type (UDT) definition e.g. class, enum, struct (for C#) */

public class HexEditor {
    public int count = 10;

    public int doSum(int x, int y) {
        return x + y;
    }
}

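Usage:
import HexEditorApp.HexEditor; // import keyword for type inclusion

public class HexConverter {
    public void Convert() {
        // count and doSum are instance members, so an object is needed
        HexEditorApp.HexEditor editor = new HexEditorApp.HexEditor(); // fully qualified name
        HexEditor sameEditor = editor; // short name works, as the type is already imported
        int countElement = editor.count;
        int sum = sameEditor.doSum(10, 20);
    }
}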

C#
Definition:
namespace HexEditorApp
{
    /* In C# (as well as Java), you can't declare a variable or method outside
       a User Defined Type (UDT) definition e.g. class, enum, struct */

    class HexEditor
    {
        public int count = 10;

        public int doSum(int x, int y)
        {
            return x + y;
        }
    }
}

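Usage:
using HexEditorApp; // C# uses the using directive (there is no "using namespace")

public class HexConverter
{
    public void Convert()
    {
        // count and doSum are instance members, so an object is needed
        HexEditorApp.HexEditor editor = new HexEditorApp.HexEditor(); // fully qualified name
        HexEditor sameEditor = editor; // short name works, as the namespace is already imported
        int countElement = editor.count;
        int sum = sameEditor.doSum(10, 20);
    }
}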

Definition vs. Declaration: Make identifier known to compiler w/o values
We declare an identifier to make it known to the compiler so that we can use it afterwards. But this is only half of the story. With a definition, we give a concrete meaning to this identifier: storage for a variable, or a body for a method.

// declaration of a variable (C++): name and type only, no storage is reserved
extern int count;
// declaration + definition of the variable
int count = 0;

// declaration of a method
int doSum(int x, int y);

// definition of a method
int doSum(int x, int y)
{
    return x + y;
}

Both Java and C# assign a default value to every field of a primitive type, i.e. such identifiers are defined at the place of declaration (more on this in the next post); local variables, in contrast, must be assigned before they are read. These languages also differ from C++ in how they define methods. In C++ it is possible to declare a method in one place and define it somewhere else. In Java and C#, a method is normally defined at the place of its declaration (abstract and interface methods, which have no body, are the exception).
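The separation of declaration and definition in C++ can be sketched as follows. This is only a minimal illustration: the class name HexEditor is borrowed from the earlier examples, while the member names and the suggested file names are my assumptions.

```cpp
// Declaration: normally placed in a header file
// (e.g. a hypothetical HexEditor.h).
class HexEditor
{
public:
    HexEditor();                 // declared only, no body here
    int doSum(int x, int y);     // declared only
    int count() const;           // declared only
private:
    int count_;
};

// Definitions: normally placed in a source file
// (e.g. a hypothetical HexEditor.cpp).
HexEditor::HexEditor() : count_(10)
{
}

int HexEditor::doSum(int x, int y)
{
    return x + y;
}

int HexEditor::count() const
{
    return count_;
}
```

Code that only calls doSum needs to see the declaration; the definition may live in another file that is linked in later. Java and C# have no such split for ordinary methods.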

Saturday, November 21, 2009

Running Notepad++ on Linux

I use Windows and Linux interchangeably. Windows 7 is currently installed on my office computer and Kubuntu 9.10 (codenamed Karmic) on the laptop. I very often need similar applications on both platforms. For simple .Net development my choice is MonoDevelop, which is cross-platform and so doesn't need any special treatment. The Eclipse IDE serves me the same way for Java development. For my blog writing, especially for the C++/Java/C# posts, I need to write C++/Java/C# code very frequently, and on Windows I use Notepad++. It helps me to write source code in the desired language and export it to basic HTML with simple embedded styles.

I needed a similar editor on Linux. I could have used the cross-platform ports of emacs or vi for the same purpose, but being a long-time Notepad++ user for HTML and blogging, I don't want to use emacs or vi for blog posts. Here I briefly explain the steps to install Notepad++ on a Kubuntu system using Wine.


1. Install Wine if it's not installed yet
      $ sudo apt-get install wine
2. Download the Notepad++ Windows executable from its download site.
3. Open the console and change to the download location.
       $cd

4. Run the installer with Wine
      $ wine npp.5.5.1.Installer.exe

Now I can use Notepad++ on Windows as well as on Kubuntu.

Friday, November 20, 2009

Identifiers


This post is part of my ongoing effort to familiarize the readers with the commonalities & variabilities of C++/Java/C# programming languages. A tentative outline of this series can be found at Contents.

I want to start today's post with some good news. Microsoft has announced that it will open source the .Net Micro Framework under the Apache 2.0 license. This is big news for open source enthusiasts: MS has come a step closer to open sourcing the complete .Net Framework in the future. Does it mean the beginning of the end of Mono? I believe Mono is more than the .Net Framework. Our cross-platform hex editor will cover both the .Net and Mono platforms, so it is a good experiment to distinguish the commonalities and variabilities of these two platforms.

Today's post will be short. I will explain the most important constituents of programming syntax: the identifiers. We need words in a natural language to communicate with each other. Similarly, in programming languages identifiers are used to communicate with the system. The language gives us a set of names called keywords which have predefined meanings. In contrast, identifiers are the names we supply in our programs. Identifiers and keywords form statements, which make up a program much as sentences make up a text in a natural language. An identifier must be different from every keyword, so it is important that you have a fair knowledge of the keywords of your programming language.

There are various types of identifiers we use for naming the entities of a program: variables, function names, constants, user-defined types, labels etc. Most programmers follow some convention for naming them. There are some well accepted naming conventions for identifiers in different languages, e.g. Pascal Case, Camel Case, and Hungarian Notation. There are coding guidelines from Microsoft for C# and from Sun for Java. Many programmers who code in both Java and C++ use the Java conventions. If I can keep up my energy, I will write a complete post on various coding conventions.

Identifier naming rules common to C++/Java/C#:
  1. case sensitive
  2. only letters (A-Z, a-z), digits (0-9), and the underscore (_) can be used (each language allows a few more characters, e.g. $ in Java, but this subset is safe in all three)
  3. can't start with a digit
  4. no space is allowed inside the identifier name
Examples:
   Valid: doSum, _tryit, transform3D
   Invalid: do Sum (space inside), /tryit (invalid char), 3DTransform (can't start with digit)
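The common rules above can be expressed as a small check. The following C++ function is only a sketch: the name isCommonIdentifier is my own, and it deliberately does not test whether the name collides with a keyword of any of the three languages.

```cpp
#include <string>

// Returns true if the name follows the common rules listed above:
// only A-Z, a-z, 0-9 and '_', and no digit in the first position.
// Hypothetical helper; keywords are NOT checked here.
bool isCommonIdentifier(const std::string& name)
{
    if (name.empty())
        return false;

    for (std::string::size_type i = 0; i < name.size(); ++i)
    {
        const char c = name[i];
        const bool isLetter = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
        const bool isDigit = (c >= '0' && c <= '9');

        // Rule 3: the first character can't be a digit.
        if (i == 0 && isDigit)
            return false;

        // Rules 2 and 4: anything else, including a space, is rejected.
        if (!isLetter && !isDigit && c != '_')
            return false;
    }
    return true;
}
```

With this sketch, doSum, _tryit and transform3D are accepted, while "do Sum", "/tryit" and "3DTransform" are rejected. Case sensitivity (rule 1) needs no check; it only means that doSum and DoSum are two different identifiers.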



Wednesday, November 18, 2009

3D Line chart with JavaScript

For a recent project, I was looking for a free 3-dimensional line chart library in JavaScript. There are a lot of server side charting applications, both free and commercial. I have been using the Google Charts API for quite a long time. But I had a rather unusual requirement: my web application must be highly interactive, so the user should be able to click every single line on the chart and show/hide certain information.

The first thing that came to my mind was to write some JavaScript code for the drawing operations, so I created a simple drawing library. But later I noticed that jsdraw2d does a much better job. It is open source and well tested, so I decided to use it instead.

I always consider OO important for any of my projects. In my web applications, I frequently use some wrapper library around JavaScript's prototype-based programming model. I prefer MooTools over Dojo, jQuery or YUI due to its purely object-oriented appeal. The motto on the MooTools web site states: "MooTools is a compact, modular, Object-Oriented JavaScript framework designed for the intermediate to advanced JavaScript developer. It allows you to write powerful, flexible, and cross-browser code with its elegant, well-documented, and coherent API". Furthermore, it provides a large set of plug-ins for the development of many GUI components such as Progress Bar, Drag-Drop, Slider, Tooltip etc.

The library with all dependent files can be downloaded from the project web site here. You can also browse the source at Google Code. Good luck with 3D charting in JavaScript.


Tuesday, November 17, 2009

Writing your first program - Hello World


This post is part of my ongoing effort to familiarize the readers with the commonalities & variabilities of C++/Java/C# programming languages. A tentative outline of this series can be found at Contents.

Enough words. Now it is time for some code. We are going to write our first program. Fire up your favorite text editor. We will write code in this simple editor until our work grows to several files. If you already have an IDE for the intended language, you can use it as well. Here we go:

C++
#include <iostream>
using namespace std;

int main()
{
    cout << "Hello World!\n";
}

C#
namespace HelloWorldApp
{
    public class HelloWorld
    {
        public static void Main()
        {
            System.Console.WriteLine("Hello World!");
        }
    }
}

Each program, when compiled, specifies a special function as the entry point. When a program starts, the platform looks for this function, and all other functionality must be reachable from this point. C++, Java and C# differ in how they specify the entry point function, even though the standard entry point function is named main (Main in C#) in all these languages (see note 1).
C++
int main(void);
int main(int argc, char *argv[]);
Java
public static void main(String[] args)
public static void main(String... args)
C#
static void Main();
static void Main(string[] args);
static int Main();
static int Main(string[] args);
1. You can specify another function as the entry point, though this is very rare.
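For the C++ form that takes argc and argv, a small sketch may help to show what the two parameters carry. The helper name collectArgs is hypothetical; it simply copies the arguments into a vector so that the rest of a program can work with strings.

```cpp
#include <string>
#include <vector>

// Copies the command-line arguments into a vector of strings.
// argv[0] is conventionally the program name, so it is skipped.
// Hypothetical helper, not part of any standard library.
std::vector<std::string> collectArgs(int argc, const char* const argv[])
{
    std::vector<std::string> args;
    for (int i = 1; i < argc; ++i)
        args.push_back(argv[i]);
    return args;
}
```

Inside int main(int argc, char *argv[]) it would be called as collectArgs(argc, argv). Java and C# hand the arguments to the entry point as a ready-made string array, without the program name.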

Compilation:
I have already discussed various compilers and IDEs for C++/Java/C# in the previous post. Now it's time to compile and run our first program.

C++
Windows:
cl.exe helloworld.cpp
Run the program
helloworld.exe

Linux (GCC): There are 3 variants to compile a C++ program with gcc:
  1. gcc helloworld.cpp -lstdc++ -o helloworld
    gcc is usually used for C compilation. It recognizes the .cpp extension and compiles the file as C++, but it does not link the C++ standard library on its own.
    -lstdc++: links the C++ standard library.
    -o: indicates the output filename. If omitted, the output filename is a.out
  2. g++ helloworld.cpp -o helloworld
    uses gcc to compile in C++ mode and links the C++ standard library automatically.
  3. c++ helloworld.cpp -o helloworld
    Most systems install this program, which is identical to g++.
Run the program
./helloworld

Java
Sun Java:
javac HelloWorld.java

GCC (very rare):
gcj --main=HelloWorld HelloWorldApp.java -o HelloWorldApp

Run the program (remember that Java as well as the Linux file system is case sensitive)
java HelloWorld

C#
Windows (MS SDK):
csc.exe HelloWorld.cs
Run the program
HelloWorld.exe

Linux (Mono):
gmcs HelloWorld.cs
Run the program
mono HelloWorld.exe (./HelloWorld.exe also works where the system is configured to launch .exe files through Mono)



Source Code for the Posts:
I have hosted a project on Google Code. The project page is cpp-java-csharp. Click the Source tab. You can browse the source code by clicking the Browse tab, or download it to your machine by clicking the Downloads tab.

Structure of the Source Code:

As discussed in the previous post, I will use the following compilers and IDEs for our exercises:

C++
  1. Windows:
    Visual C++ 2008 Express Edition
    Just open the solution file in Visual C++ 2008 Express Edition or Visual Studio 2008. Compile and run.

    Command Line
    A batch (.bat) file will accompany each project. You need to run this batch file which will generate the executable .exe file. I assume that you already have the C++ compiler (cl.exe) in the path.
  2. Linux:
    Command Line
    A shell script (.sh) will accompany each project which will generate the executable file. gcc must be in the path. Make the .sh file executable with the following command (xyz.sh stands for the name of the script):

    chmod +x xyz.sh

    Run the file with the following command if you are in the project directory:
    ./xyz.sh

Java
Eclipse:
  1. First create a new Java project in your workspace.
  2. Click "File->Import..." menu item.
  3. Import the project from the downloaded package
Netbeans:
Open the project from the downloaded package with "File->Open Project" menu item.


C#
Windows:
Visual C# 2008 Express Edition
Same as C++

Command Line
Same as C++

Linux:
MonoDevelop
Open the monodevelop project in the IDE. Compile and Run.

Command Line
A shell script (.sh) will be provided with the project. See the C++ section for how to run a shell script in the console.

Monday, November 16, 2009

Setup development environment


This post is part of my ongoing effort to familiarize the readers with the commonalities & variabilities of C++/Java/C# programming languages. A tentative outline of this series can be found at Contents.

Fasten your seat belt. Now it's time for something practical. We will start with setting up the environment for software development in C++, Java, and C#, which will be followed by our first program, "Hello World". The next posts will cover the various syntactical differences of these languages. On the journey, we will gradually develop the hex viewer that I promised to program as part of these blog posts. I will also briefly explain various tools, techniques, libraries, and technologies for cutting-edge software engineering at appropriate places.

There are numerous compilers available for each of these languages. Depending on the operating system and distribution, the installation process of these compilers varies enormously. You should consult your OS documentation to find an appropriate compiler. Most of these compilers are accompanied by a feature-rich integrated development environment (IDE) that usually consists of a text editor with syntax highlighting for the source code and a collection of tools accessible directly from this text editor. The most common tools are: the compiler, for obvious reasons; a debugger to check for programming errors; management of source files; numerous wizards for target-oriented automatic code generation; automatic unit testing; etc.

We will see here the most widely used IDEs and compilers on Windows and Linux. Some of them are available for both operating systems and/or can generate cross platform binaries. I have listed them under cross-platform tag.

  • Windows
  • Linux
  • Cross-Platform
Windows:
Microsoft Windows SDK: Microsoft provides a comprehensive set of compilers for numerous programming languages as part of the Microsoft Windows SDK. The C++ compiler that it includes is called Microsoft Visual C++ (MSVC); the name reflects the aim of writing C++ code visually, similar to Visual Basic. The newest version is highly compatible with the C++98 standard and also supports some features of C++0x, the upcoming C++ standard. MSVC can compile in both C and C++ mode. The C compiler supports the original C89 standard along with some features of C99. MSVC also includes a lot of Microsoft-specific functions, so special care must be taken with cross-platform programs: code that uses MS-specific functions might not compile with other compilers.

The Windows SDK includes a C# compiler called csc. Microsoft is the inventor of C#, so the compiler that ships with the Windows SDK is the most up to date and standards-compliant one. C# 3.0 was released with the .Net Framework 3.5. The upcoming .Net Framework 4 will include C# 4.0.

Microsoft provided a Java compiler named Visual J++ for many years. There were lots of legal tussles over Microsoft's use of Sun's Java technology until the 2001 settlement. With the introduction of .Net technology, MS replaced J++ with Visual J#, which supports writing Java source code to build applications and services on the .NET Framework. As of January 10, 2007 Microsoft has discontinued its Visual J# offering, so I would rather not use J# for future Java development.

The Windows SDK is free to download, and you can use your favorite ASCII or Unicode text editor (e.g. Notepad) to write programs and compile them from the command line. Prior to Windows Vista, the SDK was called the Platform SDK. The latest SDK can be downloaded from Microsoft Windows SDK for Windows 7 and .NET Framework 3.5 SP1. Follow the instructions that accompany the downloaded package to set up the programming environment.

C++
Open a command prompt (Start -> Accessories -> Command Prompt) and type the following:

cl  /help

It will show you various command line options that are available for C++ compiling.
Java
Type the following command at the command prompt:

vjc  /help

It should show you the options for J# compiling. Remember J# will compile your Java code to MSIL for .Net runtime, not Byte code for Java Runtime.
C#
Start the C# compiler from command prompt:

csc  /help

The options for C# compiler will be displayed.


Microsoft Visual Studio: Microsoft Visual Studio is undoubtedly one of the most popular IDEs in the world. Its current version (VS2008) and upcoming version (VS2010) have a powerful set of tools for application development. There are hundreds of plug-ins available, both free and commercial, to enhance its capabilities even further.

If you need something useful from Microsoft, you usually have to pay for it. Luckily, a stripped-down version of Visual Studio is made available for hobby programmers. It is called Visual Studio Express Edition, and you can do a lot of programming with it. There are separate Express Editions for Visual C++, Visual Basic, and Visual C#. The Express Edition is available for download at http://www.microsoft.com/express/download/. Download and install your preferred language version.

SharpDevelop is another IDE for C# development, itself written in C#. It is open source and can be downloaded from http://www.icsharpcode.net/OpenSource/SD/. The last time I used it, it was incredibly slow even for a moderate-size project.
Unless you are a command line guru, you need an IDE. Buy Visual Studio 2008 if you can. Otherwise, download the Express Edition or one of the free alternatives. Also see the cross-platform tab.
Linux:
GCC:
GNU Compiler Collection (GCC) is the favorite C/C++ compiler among open source programmers. It helped lay the foundation for the free software movement during the 90s and is still the dominant compiler collection on the Linux operating system. Even though it is free, it provides a comprehensive set of tools and extremely flexible compilers for cross-platform application development.

If you are using Linux, you most likely already have GCC installed. Otherwise, you can install a preconfigured and prepackaged version for your distribution using the package manager of your system. I am an Ubuntu user. For me the commands are:

sudo apt-get install gcc
sudo apt-get install g++

If you are a Linux geek and want to build your own version, download the latest version of GCC from ftp://ftp.gnu.org/gnu/gcc/. You have to go through the usual steps of building software on a Linux system: Configure -> Build -> Install. Enjoy!

GCC also includes a compiler for Java called GCJ. It drew attention when Sun Java was not yet open source, but nowadays most distributions prefer Sun Java over GCJ.

Now check whether your installation is working properly. Open up a terminal and run the following commands-
C++
The command for C++ compiler is g++. Run in the terminal -

g++   --version

You should see the version information for g++.
Java
If you want to use the java compiler of GCC, run the following command in a terminal -

gcj   --version

It should show you the version information of the gnu Java compiler.

For Java and C# on Linux, see cross-platform tab.
Many *nix enthusiasts use generic editors like emacs and vim as their IDE. For popular IDEs on Linux, see the cross-platform tab.
Cross-Platform:
Virtualization:

It is the process of running one operating system on top of another. You can run Linux on your Windows machine or vice versa with appropriate virtualization software. VMWare is the market leader, providing virtualization software for all popular operating systems. VirtualPC is another virtualization product from MS, but it only supports running one version of Windows on another. Sun's VirtualBox is also a popular alternative. It has an open source version called VirtualBox OSE which can be freely downloaded and installed.

GCC:
GCC was originally intended for compiling programs on *nix systems, but there are various ways to use it on Windows. I have listed some of the available options:

Emulation: Cygwin (Gnu + Cygnus + Windows) is a Linux API layer providing Linux functionality. You can install Cygwin on your Windows machine to use your familiar Linux commands on Windows. I used this emulation layer for quite a long time on my office PC, where a Linux installation was strictly forbidden. You can compile your programs with GCC under Cygwin the same way as you do on Linux.

Software Port: MinGW (Minimalist GNU for Windows) is a port of the GCC compiler for building Windows applications. It provides a comprehensive set of libraries for native Windows program development.

Java Development Kit (JDK):

You need the JDK to write and compile Java programs. On Windows, you can decide to use J# for Java code, in which case your program will be a .Net program and run only on the .Net runtime. Download the latest JDK from the Sun Java Download Site and install it on your system. If you are a Linux user, you can install the prepackaged version from your distribution. Ubuntu/Kubuntu users can look at Installing Sun JDK 6 in Ubuntu/Kubuntu Interpid.

Eclipse:

Eclipse is the favorite IDE of many developers for Java-based application development. It has a plug-in based architecture: Eclipse defines only some core functionality by default, which is extended by plug-ins for the intended development environment. The Eclipse web site already provides prepackaged plug-ins for various programming languages. For a cross-language programming environment, you just need to install the required packages from its Help->Software Updates... menu command.

The installation of Eclipse is the simplest of all. On Windows, download your intended package from the above link, uncompress it, and hurrah, you are finished. Now run <directory_for_uncompress>\eclipse.exe. The procedure is the same on Linux. In case you need the commands for uncompressing, look at Most Useful Linux Commands. If you are using a popular distribution, you can also use its package manager, so that installation and uninstallation are handled by the package manager front-end. On my Kubuntu system, the command is:

sudo apt-get install eclipse

Just remember that your distribution might not provide the latest version of Eclipse at the time of installation. For more information regarding the installation of Eclipse and Java on Ubuntu/Kubuntu Intrepid, see my old blog post "Set up Java Development Environment in Ubuntu/Kubuntu Interpid".

If you want to install another development environment, click the "Help -> Software Updates..." menu command and choose the appropriate programming language, as in the following figure.



Netbeans IDE:

Another very well-known cross-platform development environment is Netbeans. It has many cool features which make it an attractive Java IDE. There are also plug-ins for other programming languages like C++. It is equipped with a huge set of wizards for generating code for frequently needed projects and code modules. Download the latest version from http://netbeans.org/downloads/.

Remember that for C/C++ development with the Eclipse and Netbeans IDEs, you have to provide a C++ compiler to them. On Windows, you can use the Microsoft C++ compiler or the Intel C++ compiler. You can even use GCC via MinGW or Cygwin, described earlier.

Mono and MonoDevelop

So you love .Net for its cross-language feature or C# for its powerful but simple syntax, but you are a Linux user, or your program should run on both Windows and Linux. This can frequently happen if you develop an ASP.NET program and want your web application to be hosted on both Apache and IIS. You need Mono. It is an open source implementation of Microsoft's .Net Framework, based on the ECMA standards for C# and the Common Language Runtime, for Windows, Linux, and Mac OS X. It currently supports .Net Framework 2.0 and provides a compiler for C# 1.0, 2.0, and many of the 3.0 features.

MonoDevelop

The Mono framework is complemented by a cross-platform IDE which runs on Windows, Linux and Mac OS X. It is a full-featured IDE with which you can develop .Net programs that run on all these platforms.

Windows installations of Mono and MonoDevelop are provided as MSI packages. Download them and click; you will be guided through the usual Windows installation procedure. Many Linux distributions currently provide prepackaged Mono and MonoDevelop. Follow your package installation procedure to install and configure them. Installing Mono and MonoDevelop is simpler on Ubuntu than on Windows, provided you have an active Internet connection.

1. Open console
2. sudo apt-get install mono-runtime
3. sudo apt-get install monodevelop

Use either Eclipse or Netbeans for Java and C++. MonoDevelop for C#.


For beginners, I can only make the following recommendations from my personal preferences:
C++
Windows:
Visual C++ 2008 Express Edition

Linux:
Eclipse for C++
Java
Eclipse or Netbeans
C#
Windows:
Visual C# 2008 Express Edition

Linux:
MonoDevelop


Friday, November 13, 2009

Case Study


This post is part of my ongoing effort to familiarize the readers with the commonalities & variabilities of C++/Java/C# programming languages. A tentative outline of this series can be found at Contents.

Programming needs exercises. You can be a good Software Engineer without coding, but to be a programmer you need to write code. Continuous exercise can make you better (nobody is perfect in programming). For all these three languages, we will develop a simple hex viewer program. It should load a file and view the binary contents of it as hexadecimal values. The user should be able to select alternative view types e.g. octal, binary or decimal.

I have sketched a class diagram of the program which is depicted in the following:


HexViewer: It is the main class. It loads a file, adjusts the size of the display window, and manages the other objects to print the contents of the file on the window.

DataViewer: It is responsible for printing the hexadecimal contents.

HexConverter: It converts the binary contents of the file into hexadecimal values. The program can be extended by introducing more converter types, like a binary converter for binary display, an octal converter for octal display, or an ASCII converter for ASCII display.
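To give a first impression of what such a converter has to do, here is a minimal C++ sketch of the byte-to-hexadecimal step. It is not the HexConverter class of the series; the free function toHex, the space separator, and the two-digit upper-case format are all my assumptions for this illustration.

```cpp
#include <string>
#include <vector>

// Converts raw bytes into a string of two-digit hexadecimal values
// separated by spaces. Hypothetical helper, shown only as a sketch.
std::string toHex(const std::vector<unsigned char>& bytes)
{
    static const char digits[] = "0123456789ABCDEF";
    std::string text;
    for (std::vector<unsigned char>::size_type i = 0; i < bytes.size(); ++i)
    {
        if (i > 0)
            text += ' ';                    // separate the bytes with a space
        text += digits[bytes[i] >> 4];      // high nibble (upper 4 bits)
        text += digits[bytes[i] & 0x0F];    // low nibble (lower 4 bits)
    }
    return text;
}
```

An octal or binary converter would follow the same pattern with a different digit table and a different number of bits per digit, which is why separate converter types fit the design well.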

In my other blogs related to software engineering and test driven development, I will explain in detail the various stages of the SE lifecycle for developing software, using this program as an example.

Monday, November 09, 2009

Programming Paradigms


This post is part of my ongoing effort to familiarize the readers with the commonalities & variabilities of C++/Java/C# programming languages. A tentative outline of this series can be found at Contents.

The first recognized programming languages arrived in the 1950s. O'Reilly has a nice poster on the history of programming languages. In the beginning, programs were very monolithic. The data was global and available to the whole program, so a modification to a data item could affect the entire program. It was extremely hard to maintain a large piece of code. Various paradigms have been introduced over the years to ease the development of complex programs with higher maintainability and reusability of code. There are other paradigms which represent different concepts for the programmatic elements. In this series of posts, we will very frequently refer to the procedural and object oriented programming paradigms, so they deserve a bit more explanation.

  1. Procedural
    Programs are developed as a collection of procedures or sub-routines. There is an entry procedure (e.g. main in C) that calls other procedures to perform the tasks. The procedures form a chain and together accomplish the intended operations.
    Procedural programming is a type of structured programming where each procedure has local data, and modification of the local data has no effect outside of the procedure. But a procedure also has access to global data and can modify it, which in turn may change the state of the entire program.

  2. Object Oriented
    Not surprisingly, objects are the most fundamental components of OO programming. An object is a data structure consisting of data and a set of functions to manipulate that data. An OO program defines a set of objects and the interactions among these objects. In the implementation, the objects are defined by classes. An object is a run-time instance of a class, and each object is unique in its running environment. I have drawn a simple class diagram (Fig. 1) and an example object diagram (Fig. 2) for students and teachers who supervise these students in a school.




 
    Principles of OO
    Abstraction: A skeleton view of the concrete object. Properties important for the external users are separated from the unimportant properties.
    Encapsulation/Information hiding: Implementation details are kept hidden from the external users so that changes to them do not affect the users.
    Modularity: The system is partitioned into highly cohesive and loosely coupled components.
    Hierarchy: Ranking or ordering of abstractions. There are a lot of sub-principles on how to achieve these principles, which are outside the scope of this blog.
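
To make the student/teacher example a bit more tangible, here is a minimal C++ sketch. The class names, member names and the sample names "Alice", "Bob" and "Ms. Rahman" are my own assumptions for illustration; they are not taken from the diagrams. The sketch only tries to show objects as instances of classes, hidden data (encapsulation) and interaction between objects.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical class modelled on the student/teacher example in the text.
class Student {
public:
    explicit Student(const std::string& name) : name_(name) {}
    const std::string& name() const { return name_; }
private:
    std::string name_;  // hidden data: reachable only through name()
};

class Teacher {
public:
    explicit Teacher(const std::string& name) : name_(name) {}
    // Interaction among objects: a teacher supervises a student.
    void supervise(const Student& student) {
        supervised_.push_back(student.name());
    }
    std::size_t studentCount() const { return supervised_.size(); }
    const std::string& name() const { return name_; }
private:
    std::string name_;
    std::vector<std::string> supervised_;  // storage detail, invisible to users
};

// Creates three objects (instances of the classes) and lets them interact.
std::size_t demoSupervision() {
    Teacher teacher("Ms. Rahman");
    Student first("Alice");
    Student second("Bob");
    teacher.supervise(first);
    teacher.supervise(second);
    assert(teacher.studentCount() == 2);
    return teacher.studentCount();
}
```

Calling demoSupervision() from main runs the example. Because the list of supervised students is private, it could later be replaced by another container without touching the code that uses Teacher.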

I will explain how object orientation is implemented in our referenced languages when we write our first program. The seminal book [1] written by Grady Booch and co. is highly recommended for in-depth knowledge of OO concepts.

References

  1. Grady Booch, Robert Maksimchuk, Michael Engle, Bobbi Young, Jim Conallen, Kelli Houston. Object-Oriented Analysis and Design with Applications, third edition. Addison-Wesley.

Sunday, November 08, 2009

Why to Learn Programming Languages



Why Programming Language:
My first programming experience was a shock for me. On the same day that our first programming lecture in Pascal took place, we had to write a Pascal program in the lab. I had no idea why I was doing what I was doing. My first tutorial was even more shocking. Let's skip this now :)

We use natural language to communicate with each other. A letter is a written form that carries information to others and, if phrased properly, may bring a response back from the receiver. The language has a defined syntax (grammar) which the receiver must interpret into some semantics (meaning). So a common pair of syntax and semantics is necessary to establish communication successfully. The principle also holds for programming languages.

Machines are pieces of hardware which understand only 0 and 1, called bits [1]. Sequences of bits form the machine instructions and data, which are encoded and decoded to perform the intended operations. Now how can a human communicate with a machine? This is where a programming language comes into play. Like a natural language, a programming language defines a syntax with intended semantics, which the compiler converts to a machine readable format.
A programming language defines syntax and semantics to instruct the computer to do some defined tasks.
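
As a small illustration of "only 0 and 1", the following C++ sketch prints the bit pattern behind a single byte. The function names toBits and showBits are made up for this example, and the check for 'A' assumes an ASCII-compatible character set.

```cpp
#include <bitset>
#include <cassert>
#include <string>

// Returns the 8 bits of one byte as text, most significant bit first.
std::string toBits(unsigned char value) {
    return std::bitset<8>(value).to_string();
}

// Checks two well-known patterns.
void showBits() {
    assert(toBits('A') == "01000001");  // 'A' is 65 in ASCII
    assert(toBits(255) == "11111111");  // all eight bits set
}
```

What we write as the letter A in the source code is, for the machine, nothing more than this sequence of eight bits.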
Why C++:
C and C++ are definitely among the most widely used programming languages on Earth. If you are a computer science student, whether you want to or not, you will come across one or both of them in your career. So a fair knowledge of C/C++ is a prerequisite for a good software engineer. Since C++ is an extension to C, they are very often referred to as a single language called C/C++ (see-see-plus-plus).

With C/C++ you can control the system thanks to their low level capabilities, including inline assembly language [2] supported by many compilers. So for system level programming C/C++ is important. For driver and kernel development C/C++ are ubiquitous. Both Java and C# need a lot of resources to run the virtual machine on top of which your program will run. In the embedded system domain you very frequently have insufficient resources to load a Java or C# VM, and you will prefer C/C++ for those systems. There are even special C/C++ compilers for such resource critical systems.

Suppose you need cross-OS development in the Windows and *nix worlds and, in addition, want to perform system level operations. C/C++ supports you best here. There are various libraries available which let you build your program on multiple OS platforms. For Graphical User Interfaces (GUI), you can use Qt (http://qt.nokia.com/) or GTK+ (http://www.gtk.org/), which give you a consistent look of the user interface on many systems.

If you want your application to run super fast, you need C/C++. They are still the dominating languages in the game programming and 3D visualization domains. Nowadays, though, many popular games are coming out in other languages. I myself am currently working on 3D visualization on the .Net platform using DirectX technology.

C++ is very frequently called a federation of languages rather than a single language. It supports various programming paradigms. It is a natural extension to C and Assembly. You can write procedural, functional, and object oriented programs. Template Meta Programming (TMP) is a newer programming paradigm made popular by the language's support for templates. C++ is undoubtedly harder than Java and C#. It has a steep learning curve. You need many years of die-hard effort to become an expert in this language. Therefore, think first whether you really need C++ before you start learning it. But if you can reach that level, you are a master of your systems.
For system programming you can hardly ignore C/C++.
Why Java:
Java is the most widely used language for cross platform development. "Write Once, Run Anywhere" has been the motto of Java from the beginning. Though it is debated how far Java has achieved that, nobody can ignore its stronghold in multi-platform development.

Java is a nearly pure object oriented language (only the primitive datatypes like int are not objects). Every data entity and function belongs to a specific class. There is no direct notion of scattered global data, which removes much of the complexity of procedural programming. The garbage collector reduces the overhead of maintaining the program resources. The language defines a lot of standards and reference implementations which cover almost all aspects of programming, from simple desktop applications to large enterprise development. Since Java programs run on top of the Java Virtual Machine (JVM), the programmer has less control to manipulate the hardware directly which, in turn, provides a much stronger security model for a system. In addition, Sun has released the source code of the complete Java stack as open source software under the GNU General Public License, reducing the fear of vendor lock-in and thus increasing its acceptability to enterprises and open source enthusiasts.
Java is very strong in cross platform middleware development.
Why C#:
You are a Microsoft enthusiast. You have almost no chance to program on other platforms. You are not forced to program in C++ by your employer. You don't need to control your system. You like a pure OO programming language. Speed of development is important for you. Your team needs to collaborate across expertise in multiple languages. Then you should use C#.

Microsoft has invested its large workforce in .Net development. C# is the main language for this framework. It is a very neat language. MS has tried heart and soul to reduce programmatic complexity in C#. Those who are disappointed with Java's lack of high performance visualization support can program effectively in C# while equally enjoying the advantages of a managed environment. There is an easy migration path to transfer an existing code base to the .Net platform. You can even run your code in a dual environment (parts managed and parts unmanaged), which facilitates gradual migration to the .Net platform.

Microsoft developed a version of C++ for the .Net platform known as C++/CLI. It has simplified the execution of existing C++ components on the .Net runtime. It is a combination of C++ and new constructs for .Net, which again has a steep learning curve. So for new .Net development, I discourage the use of C++/CLI and rather recommend C#. Microsoft Visual J# is a Java compiler which compiles Java code for the .Net runtime. But MS has dropped support for the J# language from future releases of Microsoft Visual Studio. So it is not safe right now to start a new project in J#.

Though .Net is theoretically platform independent, the only implementation MS released for other platforms is version 1.0 of the Shared Source Common Language Infrastructure, also called RotorVM, which apart from Windows can be built only on FreeBSD (version 4.7 or newer) and Mac OS X 10.2. Commercial use of it is prohibited. Mono (http://www.mono-project.com/) is an open source implementation of the .Net framework for the Linux, Windows and Mac OS X platforms. It currently supports C# 3.0, Visual Basic 8, and several other programming languages. Similar to J#, IKVM is a Java implementation built around Mono which compiles Java code for the .Net platform.
C# is the premier language for future application development on Windows.
If you are a new learner, you may start with a managed platform like Java or .Net. In this way, you can spare yourself from learning unnecessary stuff. Management of resources is a big headache for programmers. In Java and C#, this responsibility is left to the garbage collector. With standardized attribute (annotation in Java) support, the program can be made highly configurable. Both platforms have brought a large set of standardized classes to ease the development, deployment, and maintenance of software systems.
C++ is more widespread. But C# and Java are easier than C++. Java is open sourced and has implementations for virtually all platforms. C# has addressed GUI programming in a managed environment better than Java, but it is fully implemented only on MS platforms.

For the hobby programmer, I recommend the following:
C# programming with Visual C# Express Edition on Windows is fun.


References

  1. John B. Anderson, Rolf Johannesson (2006): Understanding Information Transmission
  2. The Art of Assembly Language

Friday, November 06, 2009

Introduction


C++, Java and C# are definitely three of the most widely used programming languages in the world. C++ extends C with object oriented capabilities. C is the ubiquitous language in system and low-level programming, especially on *nix systems. Both Java and C# are derived from C++. Each of these languages has its strengths and weaknesses with regard to capability, flexibility, and simplicity.




The main differences between these languages actually lie in the compilation and execution processes. The compilation process is responsible for preparing the code to be runnable on the target platform. You have probably heard about "Managed Environment". C++ programs run in an unmanaged runtime environment. The compiler and linker of C++ generate machine readable code directly. The operating system is responsible for loading the executable program and running it.

Java and C# programs, on the other hand, run in a managed environment. The programs are compiled into so-called byte code. The Java or .Net runtime, called the Virtual Machine (VM), loads the compiled file and interprets the byte code, or converts it to machine code with a Just-In-Time (JIT) compiler; the resulting code is executed under the control of the VM. The VM has complete control over the program and can monitor, investigate and occasionally manipulate it. It is possible to generate machine code directly and thus skip the JIT compilation step, but this is strongly discouraged.

The following activity diagram shows the C++ compilation, linking, and execution process. The compilation generates object code, the linker creates runnable binary code (e.g. a single .exe from multiple object files), and at execution time the operating system loads the file into memory and runs it.






The process has more steps in a managed environment. The Java compiler generates byte code from the source. The byte code retains much of the program's structure and can be made human readable; there are even tools available that regenerate source classes from byte code. C# compilation is more or less similar. The generated code is called MS Intermediate Language (MSIL). All .Net aware compilers translate source code into MSIL. MSIL relies on a common type system (CTS) shared by all languages. So at this level, a program written in one language can be used from another language, making .Net a programming language neutral framework.



Thursday, November 05, 2009

Psychology of TDD

I joined the software development team of a mid-size company in 2002. It was, and still is, the market leader in producing a special type of measurement device. The team had a modest size and consisted of sophisticated engineers with abundant knowledge of both their application domain and software development. I was recruited as a component expert and started with refactoring and partly reengineering a mammoth code base which was the result of 10 years of continuous development. The software had been started with ad-hoc methods, and features were added gradually over the years. Though this was object oriented software programmed in C++ with MFC/COM/ActiveX, the principles of object orientation were rarely followed. But fortunately, like many other systems in the world, it was working and was the main cash cow of the company. The uniqueness of the product was the main ingredient of its extreme success in the measurement domain. It was the principal contributor to the substantial year-to-year growth of the company.

The software had a release cycle of eight months. The amazing fact was that it was never delivered on schedule. The system testing phase before the release was highly time consuming and grew alarmingly with each new release. The development cost increased over time. I started to investigate the root cause of the bottlenecks. I have compiled my findings in this post.

My colleagues were undoubtedly brilliant. But like most programmers, they were very reluctant to test. The unit tests were done only to check logic correctness. The integration was done early and often, but the integration tests were performed only just before the system test. There was no automated testing process. Most awkwardly, there was no full-time software quality assurance (SQA) position. It took a lot of time to convince the management that "Developers are unsuited for testing their own code".

At that time I was using xUnit for my C++ code and JUnit for the unit testing. My strong interest in test driven development (TDD) naturally drove me to look for an automated test process. One of my major responsibilities was to design, code, maintain and extend the data access subsystem of the software. So I wrote unit tests for the new features and then wrote the code. The tests were run automatically at night in the build system. The attempt was an immediate success. My next step was to pass my knowledge on to my coworkers and win them over to TDD. It turned out to be much harder than I initially thought.

It is all about human Psychology

TDD means writing the test cases before you write the first line of code. The tests will fail as long as the code does not exist. This is in direct conflict with the classical approach of writing the code first and then testing its correctness. Developers need a different mindset to adapt to this new paradigm. Once they get used to the system, programming is fun.
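
The test-first order can be sketched in a few lines of C++. This is only an illustration under my own assumptions: toHex and testToHex are hypothetical names invented here, and plain assert stands in for a real unit testing framework.

```cpp
#include <cassert>
#include <string>

// Step 1 (written first): the test states the expected behaviour.
// At this point toHex is only declared, so the program cannot link yet.
// That is the failing test.
std::string toHex(unsigned char value);

void testToHex() {
    assert(toHex(0) == "00");
    assert(toHex(10) == "0A");
    assert(toHex(255) == "FF");
}

// Step 2 (written afterwards): the simplest code that makes the test pass.
std::string toHex(unsigned char value) {
    const char digits[] = "0123456789ABCDEF";
    std::string result;
    result += digits[value >> 4];    // high nibble
    result += digits[value & 0x0F];  // low nibble
    return result;
}
```

Calling testToHex() from main, or from a nightly build job, repeats the check every time the code changes.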

Why should a company change its usual process if it is working? Why should the engineers learn a new paradigm if they are experts in their application domain and their future is almost secured? This is obviously different from a situation where a company unequivocally decides to apply some process as part of its business strategy. I needed a diplomatic approach based on proof of performance. I took one simple product and set up some metrics to measure the work hours spent on a release. The same measurements were repeated in the next release, now with the test-driven approach. The outcomes of the two measurements were compared against the planned work hours. I was able to show that TDD saved 30% of our test phase. Our test process was consuming 25% of our release cycle. So a 30% reduction in testing amounts to a saving of 0.30 x 25% = 7.5% of the complete release. It was a very convincing result for this simple application. We later found that the unit tests bring much better results when combined with frequent integration tests.

TDD needs a different mindset. The epic book "The Psychology of Computer Programming" by Gerald M. Weinberg encouraged me to closely investigate how my coworkers react and adapt themselves to this new paradigm. In the end, I found very interesting results.

  1. You Know Unit Testing:
    Implicitly or explicitly, every programmer needs to do unit tests. It can be done by running the program and manually checking that it does what you wanted it to do. This should be done for every single unit of code you write.
    Check what you write is right
  2. You Know Test is Recurring:
    Even for a simple program, it is extremely difficult to remember exhaustively how each statement depends on the others. Changing a single statement at one point in the code may break your program at some other point. If you believe you are God, you don't need to take extra care of dependencies. In that case you don't need to program either. The process of continuously retesting a program for bugs after changes is known as regression testing.
    You know regression testing
  3. You Understand Automatic Test Execution:
    So you are convinced of regression testing. It is a recurring process. If your company is ready to spend an infinite amount of money only on testing and you enjoy testing more than programming, you are welcome to do so manually. In that case you had better ask your company to move you to the quality assurance team and concentrate on system testing.

    You certainly can't afford to run an ever increasing number of tests manually. You have to find some automated mechanism. The classical approach is to use some interpreted language like batch, vbscript, bash etc. to write test scripts. If you would love more control over the test process, continue reading.
    Automate your tests
  4. You Accept the Importance of Systematic Approach:
    For a simple program, a script may serve you well enough. But for a complex piece of software you need an engineering approach. Otherwise, you might need unit tests for your test scripts as well, and this cyclic requirement causes you to fall into an infinite loop.

    There are various frameworks available for systematic unit testing in all major programming languages. The xUnit family of frameworks is the most widely used among them. The name is derived from the widely successful JUnit for Java. There are currently many similar frameworks for other languages as well. I have used CppUnit for C++ for quite a long time. NUnit was the first to come for the unit testing of .Net programs including C#. But the testing framework integrated in Visual Studio 2008 is also very good. A comprehensive list of unit testing frameworks can be found at http://en.wikipedia.org/wiki/List_of_unit_testing_frameworks
    Know appropriate unit testing frameworks
  5. Mocking is not mocking:
    As a component engineer, I always have problems finding the perfect granularity at which to fire up my tests. When a new component is designed, we define the interfaces first, then the abstract classes, and finally we create the concrete classes. The interfaces and abstract classes are non-creatable, so we can't instantiate them. We can't test them until we implement the concrete classes. The "Test Early, Test Often" principle of TDD can't be applied to components that do not exist yet. Let me give another scenario. You are designing some facade objects which are supposed to connect to some other systems. They are good candidates for mocking so that you don't always need a live connection to test your components. My favorite candidates for mocking are graphical (GUI) elements. They are usually difficult to test in an automatic unit testing process and need manual intervention. Mocking them enables unit testing without human interaction.
    Mock early, Mock often
In a separate post, I will explain how mock based unit testing can be easily set up.
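
Until then, here is a minimal hand-written mock in C++ to illustrate the idea. All names (DeviceConnection, MockConnection, MeasurementReader) are hypothetical and invented for this sketch; a real project would more likely use a mocking framework.

```cpp
#include <cassert>
#include <string>

// Hypothetical interface for a facade that talks to an external system.
class DeviceConnection {
public:
    virtual ~DeviceConnection() {}
    virtual std::string readValue() = 0;
};

// The mock returns a canned answer and records how often it was called.
// No live connection is needed.
class MockConnection : public DeviceConnection {
public:
    MockConnection() : calls(0) {}
    std::string readValue() { ++calls; return "42"; }
    int calls;
};

// The code under test depends only on the interface.
class MeasurementReader {
public:
    explicit MeasurementReader(DeviceConnection& connection)
        : connection_(connection) {}
    int read() { return std::stoi(connection_.readValue()); }
private:
    DeviceConnection& connection_;
};

void testReaderWithMock() {
    MockConnection mock;
    MeasurementReader reader(mock);
    assert(reader.read() == 42);  // canned value was parsed
    assert(mock.calls == 1);      // the connection was used exactly once
}
```

Because MeasurementReader only knows the interface, the same test keeps working when the real connection class is written later.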

Sunday, November 01, 2009

Preface



I am a programming freak. I love to play with programming languages and don't mind learning a new one if it has the potential to bring value to my team. My first program was in Pascal, which I wrote as part of an undergraduate course. Programming was an utterly complex task at that time, so I didn't like it. I learnt the C language shortly afterwards and quickly fell in love with it. I started to enjoy programming and to try new things out. I was convinced that C could solve any programming task in the universe. It is correctly termed a middle-level language due to its dual support for higher level features and lower level capabilities (even inline assembly support).

My experience in professional life is quite different. Client requirements are more important than a developer's interests. Software is developed over a long period of time with a vision to run it for a much longer period. The importance of maintainability and reusability increases drastically with the complexity of the task. I had to learn the principles of object orientation for that, and C++ was my natural choice. Since then it has been my most preferred language.

Things were not always smooth with C++ in the industry. It has an abundance of features that very often confuse developers. There are far more features than a developer can keep in mind at any point in time. If you decide on OO for your project, why do you need C-style procedural programming? If you don't need to manipulate the hardware directly, why do you need assembly compilation at all? The direct resource management adds to the woe. Your employer will always try to get the task done at lower cost. Even if a project costs little, the next project is expected to cost even less. Visual Basic (VB) added new capabilities to build programs in minutes. It makes program development simpler and faster. So I had to try that out in my company. Unfortunately, when a program becomes large and you need to do complex stuff, VB becomes almost uncontrollable, and lower level features are even impossible to implement.

Java has radically changed the landscape of software development since its introduction in 1995 by Sun Microsystems with the slogan "Write Once, Run Anywhere". Java is an object oriented language derived from C++. Many ambiguities of C++ were removed. It has simplified memory management through automatic Garbage Collection (GC). The language is accompanied by a large collection of class libraries. The involvement of large software companies and the presence of a strong developer community around the world resulted in a massive explosion of systems based on this technology. It has won the support of many enterprises around the world thanks to its superior security model and platform independence. Sun has released the Java Development Kit (JDK) as free software under the GNU General Public License (GPL), thus further increasing its acceptability to enterprises and developer communities. There is a large stack of software frameworks from Sun, the open source community, and commercial vendors for distributed web development. The development is driven by Java Platform Enterprise Edition (Java EE), which is built on a set of widely accepted technologies, specifications and standards for distributed system development like Servlets, Java Server Pages (JSP), Java Server Faces (JSF), and Enterprise Java Beans (EJB). These are further supplemented by several developer friendly and community driven frameworks, e.g. Struts, Spring, Hibernate etc. The Apache Tomcat and Sun Java System Web Server are two free and open source web servers used very often for web application deployment.

The penetration of the Java language, especially in the application server sector, and its increased popularity influenced Microsoft to develop its own suite of similar frameworks. Consequently, it released the .Net Framework in 2002. Due to Microsoft's muscle in the PC industry, it was immediately clear that the framework would push forward and take a strong place in the future. The .Net framework is a programming infrastructure to build, deploy, and run applications primarily on Microsoft Windows operating systems. Similar to Java, programs which are targeted at the .NET platform run in a managed environment, theoretically providing platform independence for the programs once a .NET runtime is available for the target operating system. In addition, the .NET Framework has cross-language support that provides a mechanism to share .NET components developed in different languages, allowing collaborative development of a system by developers with expertise in different programming languages. Though the .NET runtime and SDK are free, the costs of the Windows operating systems on which such programs are built and run make it less attractive for many programmers. Like Java, C#, the main language of the framework, is also derived from C++ with some restrictions and enhancements to the base language.

I have had a lot of opportunities to work with quite a number of software technologies thanks to the diversity of development fields that all my companies had. This enabled me to work with these three languages simultaneously. I would like to share my experiences with the Internet communities that have helped me over the years in experimenting with new stuff, solving problems, and coming up with ideas. A computer science student or programmer is often required to read, understand and write programs in one or another of these languages. As they are derived from a common root, the syntax and programming structures are very similar. A programmer having some knowledge of one language is able to get into the others very easily. I have used the term commonalities to refer to the common aspects of the languages. Variabilities, in contrast, refer to the differences among the languages. For example, all the languages have an int data type. The int data type in C++ is platform dependent: it is typically 2 bytes (16-bit) long on 16-bit platforms and 4 bytes (32-bit) long on 32-bit and most 64-bit platforms. On the other hand, int in Java and C# is always 4 bytes (32-bit) long.
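
The int example can be checked directly in C++. The sketch below is only an illustration; the function names are my own. It measures the width of the platform's int and compares it with the fixed-width type std::int32_t from the <cstdint> header (available since C++11), which behaves like int in Java and C# in this respect.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// Width of the platform's int in bits. The C++ standard only guarantees
// that int can hold at least 16 bits worth of values.
std::size_t intBits() { return sizeof(int) * CHAR_BIT; }

// Width of the fixed-width alias: exactly 32 bits wherever the type exists.
std::size_t fixedIntBits() { return sizeof(std::int32_t) * CHAR_BIT; }

void checkIntSizes() {
    assert(intBits() >= 16);       // platform dependent, but never less
    assert(fixedIntBits() == 32);  // the same on every platform
}
```

If a C++ program has to exchange binary data with Java or C# code, using std::int32_t instead of int avoids surprises on platforms with a different int size.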

Learning programming languages can never be accomplished without exercises. In this set of posts, a simple hexadecimal reader (Binary Reader) will be developed step-by-step in all these languages. In each section, language features belonging to some category (e.g. keywords, data types, statements) will be introduced. We will explain the differences in the usage of the language elements. Elements special to a specific language will also be listed or explained. Some of the features relevant to our case study will be used in the code. Each language is further supplemented by extra tools and conventions, often offered by 3rd parties, to ease development, documentation, build or deployment operations. We will also mention them in their proper places.


We will start with simple syntactical differences between these languages, and then move on to Graphical User Interfaces (GUI), multi-threaded programming, database development, and web programming. If I can keep up my energy, I will dive into large scale complex enterprise development. We will not go deep into every technology, but rather indicate the tools and technologies available for each type of development. In the next post, I will show how the compilation and runtime processes of these languages differ.