JavaScript SPA projects – 20 things to watch out for: 10 must-haves, 10 pitfalls to avoid

As I review large JavaScript implementations within our firm, I see basic aspects missing, for various reasons.  Essentially, if you are running a JS project that is attempting an SPA implementation, please make sure you read these twenty things.  You might not need all of them if the application is a very small one.  Many of the must-haves apply to non-JS projects as well.

NOTE: I have started linking various related articles via hyperlinks into this article. Hope it is of use to you.

MUST HAVES

1. Functional Document

Each program being executed needs a clear functional requirements document that explains the use cases, broken up by functional components. Often, UX design is done with very little understanding, and that becomes the source of truth for developers to write code.  It will have little to no validation or data requirements. Just having the look and feel and the page flow is not enough for a UI to be developed. I am not sure why this is hard to understand, but with crappy timeline plans, I have seen more than enough project managers ask the UI team to start developing and then ask them to keep changing things as the requirements become clear.

Functional requirements have to be clear and complete – for the UI developers to plan the components and services. Agile does not mean that you don’t need any requirements. One of the great misconceptions about Agile is to think Agile = no need to have requirements detailed out. Agile is to enable the team to adapt to changing needs.

2. Architecture: SAD

A System Architecture Document is a necessity – it should have a 4+1 view of the system being developed.  Such a document should address the security aspects (cross-cutting concerns) of the JS implementation.  The JSP/ASP (server-side rendering) approach to security will not work for a JS SPA solution.  For example, user profile information in the browser is the last thing we would like to see (meaning, it is a no-no to have profile data in the browser).

With the controllers and view models all moving into the browser, it is necessary to have a layered architecture thought through for the JS code as well.

  • See 4 + 1 view for details
  • See ‘Patterns of Enterprise Application Architecture’ for details on layered architecture; this wiki page also provides a brief overview of layered architecture

3. System Currency Document & Standards

JS platforms are evolving at a rapid pace.  Most of them are open source and free to use, for enterprises as well. Not all of their releases are backward compatible. For example, Angular 2.0 (the upcoming one), and EmberJS in the past, are releases that favor(ed) features over staying backward compatible.  With such a changing environment, it should be a mandate for an architect in a large JS program to establish a formal or informal Architecture Review Board that will approve and finalize the list of UI libraries.  All such finalized UI frameworks, and any other platforms and third-party products, should be maintained with their version numbers.

Any development, SIT, UAT and production setup should be done against this system currency.  In addition, the system currency should be used to evaluate technical debt at a higher level from time to time.

4. Seed Project

JS frameworks are still evolving, unlike the Java/.NET space. If you take AngularJS, there are various seed projects available that set up the basic pieces – like authentication, interceptors and the project structure (to hold UI components, build files, test code, module/sub-module loaders, Grunt or Gulp, test directories etc.) – out of the box. If you are starting a JS project and do not have a standard project (code) structure, then you are clearly letting the developers re-invent the wheel.

5. NFRs – from Day 0

Key NFRs for UI include performance, accessibility, offline access, compliance (there might be regulatory compliance applicable to the domain), audit needs, expected longevity (does this app need to stay working for 10 years before the next upgrade discussion?), internationalization, security needs and, most of all, usability expectations.  The requirements document should have sections (in addition to the functional details) that brief on each of these.

Many a time, UI development is half-way done when such discussions start; I have been part of such projects and discussions as well. Thinking through and raising these aspects before you start a program is essential.

6. Continuous Integration

For some reason, folks think CI is not applicable to JS development. Not sure why. Grunt tasks should be set up so every change is tested via the Jasmine/Mocha/Protractor tests against the browsers listed in the requirements document.  Without CI, finding browser issues in the last phase of testing results in patch code and ugly fixes.
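A minimal sketch of such a Grunt wiring, using the grunt-karma plugin; the task name, config file path and browser list are illustrative assumptions:

// Gruntfile.js – run the full test suite on every CI build
module.exports = function (grunt) {
  grunt.initConfig({
    karma: {
      ci: {
        configFile: 'karma.conf.js',
        singleRun: true,                  // one pass per build, then exit
        browsers: ['PhantomJS', 'Firefox'] // should match the requirements document
      }
    }
  });
  grunt.loadNpmTasks('grunt-karma');
  grunt.registerTask('ci', ['karma:ci']);
};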

7. TDD (at the least write tests)

I am not going to preach TDD here, even though I love TDD and how, done properly, it impacts the design of a system. Yes!! TDD enables you to do a good design!!!

Each JS developer should have Karma running, so each change he/she makes results in immediate execution of the Jasmine/Mocha tests. No other practice provides feedback this immediate.

8. Role oriented team structure

Unlike the past, where UI development was one-dimensional and the focus was only on developing the views, present-day JS MV*-based UIs are single or multi-page applications, not web pages anymore. Hence there is an increased necessity to break up and structure the UI developers based on the type of work. This enables them to get better, avoid errors in what they do, and work at an increased pace.  Check out this post, which discusses a role-based team structure for large-scale JS development.

9. One Pagers

Developers and architects alike balk when there is a need to write a large document; at least, this is a prevalent issue. One-pagers (actually half-a-pager each) are brought in to make documentation easier. One-pagers can be per screen or per use case; essentially they are written by architects who analyze the UI page by page and write down the key aspects of its workings. This becomes the blueprint for the developers to follow.

A sample one-pager for a screen should capture the following. Other aspects are project-wide standards – like coding standards and best practices (such as avoiding $rootScope).

A one-pager should contain the list of:

  • Reusable UI Components to make use of
  • Security recommendations to follow
  • Events to handle, with samples quoted
  • Backend services to integrate

10. Brown bag (OR) Over-Tea-Snack sessions

JS frameworks and related techniques are evolving at a rapid pace (frankly, big data and even the JDKs are moving ahead with tons of changes); hence it is necessary to inculcate the habit of constant learning in the team. Otherwise the project will depict a new stack in the design document, but the code will be from the past (like our typical products in the market – the same old thing in a new cover).  To avoid this, brown bag sessions should be encouraged.

Brown bag sessions bring in few key things:

  • Informal setting: it is an informal setting – so even a discussion where someone is munching on their sandwich is fine. In India this might be hard; similar to a brown bag, an over-tea-snacks session provides the setting.
  • Lighter preparation – group discussion: unlike a formal presentation, one person can start on a topic and many others can contribute.

The essence of these sessions is to ensure the team keeps up with the latest changes that are must-haves.  Typical offshore teams are very large compared to the teams I have worked with in the Bay Area. In such cases, break these up into multiple sessions. A session should not have more than 10 or 15 participants.

PITFALLS TO AVOID

Common pitfalls I have seen in various AngularJS programs are written below, in no particular order.

1. Objects, Objects – No hanging attributes

For some reason, developers stop seeing objects and move to an attribute view in JS. This might be because, in JSPs (for example), they were handed the backend objects and just needed to pick the attributes to be shown. Now UI developers get to write the controller and service-invocation layers as well.  Frameworks like AngularJS have $scope; developers who do not understand it properly see it as a cloud to which they can throw anything, and the view gets it.

For example, if there are 5 attributes required to be shown in the UI from 2 different objects, the attributes attached to the scope should be namespaced – like $scope.student.age, $scope.student.name, $scope.student.discipline, rather than just $scope.age, $scope.name, $scope.discipline.


$scope.balance = personFromServiceCall.savings.balance;
// avoid such attribute throwing over the wall (via $scope)

$scope.person.savingsAcct.balance = personFromServiceCall.savings.balance;
// ensure each UI field we handle or attach to the scope belongs to an object

Essentially, start having your value models in place. Create new value models only as a last resort, when the view’s needs cannot be met by the models returned from the services.

2. Encapsulation

In AngularJS implementations, many a time I see the JS promise returned by a REST service call being parsed on success or on error (see async invocation in AngularJS) in the controllers. As the single responsibility principle (the S in SOLID) goes, the responsibility of retrieving the object is the service’s and NOT the controller’s.  When Java/.NET developers move to JS development, they are used to the synchronous invocation style; hence they just return a promise to whichever controller calls the service, and every controller tries to parse the promise and build the returned object.

Rather, if you have a function within the service (an AngularJS factory) that resolves the promise to the object you expect, it is done once – and not in every controller.
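A minimal sketch of the idea, assuming a hypothetical person resource and service name:

// the service parses/shapes the payload once, in one place
angular.module('app').factory('personService', ['$http', function ($http) {
  return {
    getPerson: function (id) {
      return $http.get('/api/persons/' + id)
        .then(function (response) {
          // resolve the HTTP response to the domain object here – not in every controller
          return response.data;
        });
    }
  };
}]);

// controllers stay thin: they receive the ready object, not a raw response to parse
angular.module('app').controller('PersonCtrl', ['$scope', 'personService',
  function ($scope, personService) {
    personService.getPerson(42).then(function (person) {
      $scope.person = person;
    });
  }]);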

3. Sub module level loaders

For a large JS implementation (anything spanning 40 or more screens), the first priority is to identify the modules and sub-modules. Loading of the UI should be at the sub-module level. The seed projects we build should have sub-module level loaders. There are two implications if we do not do this correctly: the first is performance, the second is security.  The first is the most obvious; not so obvious is the security aspect.  Let us review the SPA from a source-loading perspective and then the security aspect:

  • JS Source in the browser can be seen by all
  • Majority of the applications have role based ACL needs.

Loading more screens than necessary forces the UI developer to do entitlement checks in the UI and show/hide the screen (or part of it) to the viewer. With the entire source loaded into the browser, it takes no more than a few minutes to debug and understand the checks done based on entitlements.

Breaking up the modules into levels, according to the ACL, is a necessity. Discussing security in an SPA is a separate topic on its own; to leave you here, the only safe mechanism is to secure the REST (backend) services with appropriate ACLs.

4. Code Comments

As lots of Java/JEE or similar developers move to JS, their view thus far has been: “JS is scripting stuff and only a few things are done in it, so comments are not that necessary”. JS is not type-safe like C# or Java. Hence, without comments (and proper usage conventions defined as standards for a project), understanding a bunch of JS files is next to impossible.  Provide comments at a folder level at the very least.

5. No $rootscope

Do not pollute the global space in JS. Have namespaces that hold the variables. Similar to global variables, $rootScope in AngularJS is the topmost scope defined for the application. Ensuring the variables (UI fields) used are in a proper namespace (like addressing person.savingsaccount.balance) makes the code easier to understand and helps avoid unwarranted collisions (and the bugs they bring).

6. Entitlements in Browser

With the source loaded and readable in the browser, having the entitlements and checking them in the browser enables a hacker to understand the internals of your system. Rather, you can ask the backend to let the UI (SPA) application know whether an action is permissible. Even then, the session token that gets sent should enable the backend system to determine the user and the role, based on which the access rights (entitlements) can be figured out.

7. Not using Karma

Writing Jasmine or Mocha tests is mandatory; but without running them across the channels of consumption (web or mobile browsers), they are of little use.  Running them via Jenkins/Grunt scripts is one option, but Karma provides instantaneous feedback by running the configured test cases against the configured browsers in real time.  Not using Karma will impact the timeline, as bugs will be found at a later point – either during hourly or overnight tests, or at QA time.  With every passing minute, the buggy function, controller or service could be called by other layers, resulting in various other issues.  The debugging and resolution time spent on such avoidable defects is a criminal waste (in my view).
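A minimal karma.conf.js sketch, assuming Jasmine and illustrative file paths; autoWatch is what gives the real-time feedback described above:

module.exports = function (config) {
  config.set({
    frameworks: ['jasmine'],
    files: [
      'bower_components/angular/angular.js',
      'bower_components/angular-mocks/angular-mocks.js',
      'app/js/**/*.js',
      'test/unit/**/*.spec.js'
    ],
    browsers: ['PhantomJS', 'Chrome', 'Firefox'],
    autoWatch: true,   // rerun the tests on every file save
    singleRun: false   // set to true on the CI server
  });
};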

8. jQuery based Event handling

Folks never seem to move on from jQuery. Even after getting JS MV* frameworks like AngularJS, they continue to do UI event handling with jQuery. It is a double whammy: you are introducing an unwanted dependency on jQuery, and you are not learning the features of AngularJS (for example). Avoid jQuery completely.

9. Controller bloat & DOM Manipulation

Services in AngularJS do the brunt of the work; controllers are like managers – they should make sure the events/exceptions get handled in the correct way and proper UI navigation takes place. Beyond that, they should NOT try to do more. If they stick to the above, they will de facto stay smaller.

Folks typically complain: “the number of UI fields I have is way too high (80, for example, in one of the projects I worked on); there is no way I can reduce the size of the controller”.  For those folks: just go and check out the OOAD principle of encapsulation.  Controllers do not need to know the details of the value model.  They are there to pass it to the view; views place the appropriate attributes in the appropriate places. With two-way data binding in AngularJS, there is no need to fetch the UI values and set them on the models.

The second thing is DOM manipulation: avoid manipulating the UI directly. There are tons of articles on the UI event-handling capability of AngularJS.

10. Local NPM

With so much change happening in open source (particularly JS) frameworks, it is justified to have a local NPM registry that holds your approved versions of the frameworks.  Each library or framework in the local NPM should have gone through the necessary license checks, as we don’t want a developer casually adding a third-party, internet-available library that is not commercially free to his/her source.

As I write this, there are more articles that seem necessary – one for sure, about the security of JS MV* applications.


JAVA 5 – 8: 2005 TO 2015 – GENERICS, COVARIANCE – PART I


In the past 10 years, Java has moved on from Java 1.4 to 8. Major releases, from JDK 5 (Tiger) to the lambda expressions now, have brought many changes to Java.  This blog entry reviews the key changes brought in over the last decade.

Rather than seeing what is new in each release (which was the original thought), the key changes and concepts introduced are reviewed.  This is in no way a full-blown list of changes with examples; every release has made a few changes to collections, and not all are discussed here. For the entire list, please check Oracle’s release documentation.

As I write code to discuss a feature, I push it to the GitHub repository given below.  https://github.com/ganeshkondal/jdk5-8

Note:
As I started writing, I realized the amount of content that needs to be covered, so I had to split the content into multiple blog entries.

Assumption
The key assumption is that you (as a reader) have worked in Java and are coming back to refresh your knowledge of the past releases.  If you are new to Java, you are better off with the Oracle tutorials and not this blog entry.

Generics
At first look, Generics meant ‘no more typecasting’ to developers, but it offers a lot more.  Generics is about tighter type checking at compile time. At run time, type erasure removes these type parameters and inserts the appropriate casts.

Generic – Raw types, methods

If you want to define a container which can hold different sets of objects, then nothing beats defining a Container class that can add, remove, set and get T instances. Similar to the class definition, methods can be defined to work upon generic objects rather than a specific one.

public class Container<T> {
    private T item;

    public void setItem(T t) {
        this.item = t;
    }

    public T getItem() {
        return this.item;
    }
}
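And for the second half of that statement – a method can declare its own type parameter, independent of any class-level one. A minimal sketch (the method name is illustrative):

public static <E> E firstOf(java.util.List<E> items) {
    // E is inferred at the call site – for a List<String>, firstOf returns a String
    return items.get(0);
}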

Fundamentals – Generics and Subtyping

Let us say class Circle extends class Point.  If there are two list declarations – one with Point and the other with Circle (List<Point>, List<Circle>) – it does not mean that the list of Circles is a subtype of the list of Points.  This is a fundamental shift, which many wrongly assume.


class Circle extends Point{  ... }

List<Point> points = new ArrayList<Point>();

List<Circle> circles = new ArrayList<Circle>();
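To make the point concrete, here is the assignment the compiler rejects:

// the following is rejected at compile time – List<Circle> is not a List<Point>
// List<Point> morePoints = circles;   // compile error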

Wildcards

With the above understanding (of generics and subtyping), if you have an API that wants to iterate through any collection and do certain things, how will you write it?

Now comes the wildcard (?), which represents anything (any object, if used within a collection). ‘?’ is the unknown: a Collection<?> can refer to a collection of anything.

void doSomething(Collection<?> objects) {
    for (Object obj : objects) {
        // do something with each element
    }
}

Within ‘doSomething’ we can retrieve elements, but cannot add or update; trying to add or update will result in a compilation error.  This is because the compiler is not in a position to cross-check the object being added against what the collection is supposed to hold.

Bounded

Having a plain wildcard on APIs is not always desirable, as you may want to limit the types of objects the API works with. Still, the API below will result in a compilation error. The reason: any subclass of Person could be stored in the list personObjects – not necessarily instances of ‘Child’. Hence letting an instance of Child be added could create an inconsistency down the line, at run time. To avoid that, the Java compiler stops you from adding an instance of Child, even though it looks logical and reasonable within this portion of the code.

public void doSomething(List<? extends Person> personObjects) {
    // trying to add
    personObjects.add(new Child("New child object"));
    // the above line will result in a compilation error
}

Generic Method – Type parameter

If wildcards or bounded wildcards are not going to help create a generic method that does something, how will you achieve it?  The code below is the way.

public class Container<T> {
    // other applicable methods of the class are omitted
    public <E> void doSomethingAndAdd(List<E> anyList, Collection<? extends E> items) {
        anyList.addAll(items);
    }
}

Legacy Integration – Catch

Legacy code, which is very much out there in large enterprises even now (2015), does not use generics. A call to a legacy API like the one below can still make use of generics.

class LegacyClass {
    public static Collection getPersons() {
        // returns a raw collection of Person objects
    }
}

// client code that calls the above legacy API
Collection<Person> personList = LegacyClass.getPersons(); // line 1: unchecked warning
Collection<Child> childList = LegacyClass.getPersons();   // line 2: unchecked warning;
// reading elements of childList as Child can fail with a ClassCastException at run time

The legacy code is being called in the generics way; the developer is given the authority to determine the appropriate object type – the compiler can only warn.

Generic Type Variables and STATIC

The JVM comes to know of the generic type as instances are created.  Hence having the generic type as a static variable is oxymoronic: a static field is shared across all instantiations, while the type parameter varies per instantiation.

public class Container<T> {
    private static T item; // not allowed – T cannot be referenced from a static context
}

// a static field cannot support usage like the ones below;
// the intent of a generic type is to hold different types – which is not
// feasible with a single static reference
Container<TennisBall> tennisBalls = new Container<TennisBall>();
Container<CricketBat> cricketBats = new Container<CricketBat>();

If the generic type were allowed to be static, the instances above (tennisBalls, cricketBats) would not be feasible, as the static reference would be the same across instances.  Such a constraint, caused by static, defeats the purpose of having a generic type, which is to hold different items.

Type Erasure

Re-iterating the intent of generics here: “..for tighter type checking at compile time”. With this goal, generic types are erased at compile time: unbounded type parameters are replaced by Object (bounded ones by their bound), and appropriate cast statements are added at the call sites.
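As an illustration (not the compiler’s literal output), the Container<T> from earlier erases to roughly the following, with the cast inserted at the call site:

// after erasure, roughly:
public class Container {
    private Object item;   // T erased to Object (its implicit bound)

    public void setItem(Object t) {
        this.item = t;
    }

    public Object getItem() {
        return this.item;
    }
}

// at the call site, the compiler inserts the cast:
Container c = new Container();
String s = (String) c.getItem();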

Covariance, Contravariant and Invariant

Covariance is the substitution of a child in place of a parent object. This is based on Liskov’s substitution principle. Covariance shows up at various levels in Java.

Type Conversion and Arrays

Type conversion in Java is covariant. Arrays are equally covariant: a parent (like java.lang.Object) array can hold child (java.lang.String) objects.  Java is covariant in various aspects – during method invocation, instantiation/assignment, method overrides etc.:

  • Parameters : If a method takes in a parent class; a child class that extends the parent can be sent
  • Overriding Methods & Return types: Child class can override the parent’s method and return a Child object.

class Parent {
    String name = "";

    Object getName() {
        return this.name;
    }
}

class Child extends Parent {
    @Override
    String getName() { // covariant return type: String is a subtype of Object
        return name;
    }
}
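Array covariance, from the first point above, is checked only at run time, which is worth seeing once:

Object[] objects = new String[2];     // arrays are covariant – this compiles
objects[0] = "fine";
objects[1] = Integer.valueOf(42);     // compiles, but throws ArrayStoreException at run time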

Contravariant

Java does not allow a parent to be passed in place of a child reference – either in method invocations or in assignments. All such attempts result in compilation errors.

Invariant

Generic types are invariant. The assignment below is not allowed:

List<Parent> persons = new ArrayList<Child>();

Summary

With generics covered, the next posts will dive into annotations, collection package changes over the years, performance improvements, and lambda expressions.


IQ Tests

Was reading about Elon Musk yesterday. Very, very impressed. One thing led to another and I wanted to check my own damn IQ. So I took the test on the free IQ test site.

 

[Image: Free IQ Test score]

But taking a look at the score and the graph, I don’t believe in coming up with a score from just 20 questions. Anyhow, it is always great to see such a good score. 🙂

Knowing myself, online scores from various sites are definitely exaggerated; maybe such sites give high scores to make people feel good about themselves. Per various discussions on Quora, Mensa’s online IQ test seems to be the closest.


Large scale JS development – team structuring

Introduction

JavaScript MV* framework based UI development is the way for most of the projects kick-started this year, and I see the same happening for the upcoming year as well. As we start a similar journey to develop hundreds of screens using AngularJS as our JS MV* framework, it has become very evident that we need to focus on effectively structuring the team. How we position ourselves so the developers get better at what they do and increase productivity is going to be key to achieving tight deadlines.

Teams up against daunting timelines need to improvise in how they approach development.

Off the Block

The first step is to identify tasks and group them; following that is structuring the team with a clear sequence and responsibilities.

  • Tasks & Group – The intent is to identify tasks that are similar in nature, so there will be repetition
  • Team Structure – Introduce a lifecycle and ensure the team is aware and practicing it
Task Groups

UI development using JS MV* framework can be grouped under the following buckets.

  • UX Design
    • Develop the UX design and Deliver the wireframes
  • HTML 5 / CSS 3 development
    • Delivers the HTML 5 and CSS 3 as per the wireframe designed
  • UX Components
    • Create the UI Component using AngularJS custom directives
  • MV* Code
    • Develop the router
    • Develop the controller
    • Develop the views, partials
  • Service Integration
    • Integrate the REST API backend (eg. via Restangular)
    • Develop the models necessary

Guiding Principles

Set up a framework the developers can rely on, and let it provide the guidelines for how modules evolve.

  • Framework – Assign a few of your best folks to build the base seed project. They can either take up angular-seed or build one such that:
    • Functional modularity is addressed in the base project
    • Builds are setup with grunt
    • Bower based dependency management is setup
    • Unit and End to end tests are setup using Jasmine/Karma
    • Ensure Flow (Facebook’s open source JS type checker) and JSLint are set up to ensure code quality
    • A folder structure matching your functional modules is identified
    • Overarching router is in place, so the functional modules can be made pluggable
  • Repetition
    • Structure the team so the members do repetitive tasks
    • Repetition will position the engineer to get better at it and avoid recurring issues
    • In turn it will increase the productivity
  • Reusability
    • Componentize as much as appropriate
    • Avoid repeating code
    • Components are packaged separately with a release cycle of their own
  • Parallel Development
    • UI developers waiting for backend services to complete is a typical bottleneck
    • It needs to be broken down
    • See ‘mock service’ for details

Team Structure

[Image: team structure diagram]

Process

Similar to the chain-of-responsibility pattern, the teams below hand off work in the sequence listed.

The folks who will own each task are identified as well.

Team 1 – UX Designers

  • Responsibility: develop the wireframes based on discussions with the product focus groups
  • Deliverable: wireframes

Team 2 – HTML 5 / CSS 3 Developers

  • Responsibility: design and deliver the HTML 5 / CSS 3 screens based on the wireframes
  • Deliverable: HTML 5, CSS 3

Team 2 – Leads / Architects

  • Responsibilities:
    • build the framework
    • identify the components within the screen taken up for development
    • identify the contract and model (payload) expected from the REST service
    • identify the partial views and how they will be reused across screens
    • identify the event and exception handling necessary
  • Deliverables: JS framework code [custom Angular seed project]; a one pager detailing the seed project and its usage

Team 3 – Component Developers

  • Responsibilities:
    • focus only on building the components identified
    • ensure the components provide the functionality expected, across browsers
    • in addition to component development, develop the SSO / security integration (adapters) and utilities necessary
  • Deliverables: UI components; one pagers detailing each component

Team 4 – Integrators

  • Responsibilities:
    • develop the models (in MVC)
    • identify and integrate with the backend REST services that provide the entities to the UI
    • make use of the mock service rather than waiting for REST API completion
  • Deliverable: JS MV* code that uses REST API integration libraries like Restangular

Team 5 – JS MV* Developers

  • Responsibilities:
    • build the routers, controllers and views (reusing partials)
    • implement the event and exception handling
    • essentially, build the view and controller in MVC
  • Deliverable: the application

Decouple – UI and Service Development Teams via Mock Service

The intent of the mock service is to enable the UI developer to create a REST service immediately, for any given URI, with any expected payload.

The mock service takes two parameters in an HTTP POST request – the first being the URL (the URI to mock) and the second the JSON payload it should respond with on invocation.
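A minimal sketch of such a mock service using Node/Express; the registration route, payload shape and port are illustrative assumptions:

var express = require('express');
var bodyParser = require('body-parser');

var app = express();
app.use(bodyParser.json());

var mocks = {}; // registered URI -> payload

// register a mock: POST /mock/register { "url": "/api/person/1", "payload": { ... } }
app.post('/mock/register', function (req, res) {
  mocks[req.body.url] = req.body.payload;
  res.sendStatus(201);
});

// serve the registered payload for any GET on a mocked URI
app.get('*', function (req, res) {
  if (mocks[req.path]) {
    res.json(mocks[req.path]);
  } else {
    res.sendStatus(404);
  }
});

app.listen(3000);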

As we put this structure in place, having a mock and decoupling the UI team’s dependency on REST is essential.

[Image: mock service interaction diagram]

Summary

Unlike the past, where UI development was one-dimensional and the focus was only on developing the views, present-day JS MV*-based UIs are single or multi-page applications, not web pages anymore. Hence there is an increased necessity to break up and structure the UI developers based on the type of work. This enables them to get better, avoid errors in what they do, and work at an increased pace. The above is a discussion along those lines.


JVM – 32 bit vs. 64 bit & JIT Compilers

In a project I am involved in, the code base is fairly large and complex; it comes to thousands of Java/JSP files – millions of LoC.  As multiple teams are working on it, they have a build machine to build the code base, and individual developers cannot build the entire code base on their weaker desktops/laptops.  Now the catch: the operating system on the build machine is 64-bit Linux with a 64-bit JDK/JVM, while the developer machines, as well as the production machine, are 32-bit Windows machines.

Such are the times your basic understanding of the system is questioned heavily.  Let me list the questions that come to mind and answer them.

  1. Does a 32-bit / 64-bit OS matter for Java code?
  2. Can I build on a 64-bit JVM and use the result on a 32-bit machine?
  3. When should I be worried?
  4. What does 64-bit mean in Java?
  5. Is there any use in going to a 64-bit JVM?

Let’s begin with understanding the basics of Java/JVM. Then we will revisit these questions.

JVM Basics

Java Byte Code
  • Java byte code is bit-architecture / OS independent.
  • This means that irrespective of where you build the .class file – as long as you have the JVM (runtime) appropriate to the deployment machine’s architecture – you are good to go.
Note: I suggest you read this entire entry, especially the ‘curve ball’ section, before leaving.

64 bit Java 
  • The Java SDK / JVM in 64-bit is just another port of the SDK and VM to support a larger address space.
  • This enables larger memory and a larger number of threads.
  • Except on Solaris, you can directly install a 32-bit or a 64-bit JDK/JVM. Only on Solaris do you need to install the 32-bit version first and then add the 64-bit JDK on top.
  • The JVM has two modes – server and client.  Only the server HotSpot version is capable of running in 64-bit, so that is the default option.

API Perspective
  • There is no API in Java that exposes the 32/64-bit distinction. To a Java developer, it is just another platform supported by the JDK/JVM.

Performance Perspective
  • On 64-bit, all native pointers within the JVM implementation – and the address spaces they point to – are 64-bit; with the size increase comes performance degradation.
  • Per Oracle documentation, there is a 10-20% performance degradation as you move to a 64-bit JVM.
  • With 64-bit there is no real limitation on the memory space; this means the GC (garbage collector) might have a larger area to clean up, resulting in longer collections – another aspect that degrades performance.
Curve ball – native code – JNI 
  • Let’s assume you have some image libraries written in C that are 32-bit; these libraries need to be upgraded to work with the 64-bit OS/JVM you are moving towards.
    • On Solaris and other Linux platforms, even the C language has seen changes in the long datatype (increased to 64 bits).
    • That being the case, any native code that relies on such data types will need rebuilding.

Now, with this learning, let’s revisit the questions initially listed, with their answers.
  1. Does a 32-bit / 64-bit OS matter for Java code?
    • No, not at all – as long as the Java application you are building does not depend on native code (.dlls).
  2. Can I build on a 64-bit JVM and use the result on a 32-bit machine?
    • Sure; see the answer above for the caveat.
    • Make sure that when you deploy, you have the appropriate JVM – a 64-bit JVM on a 64-bit OS.
  3. When should I be worried?
    • When your application depends on native code. If not, you are good.
  4. What does 64-bit mean in Java?
    • 64-bit is nothing but another port of the JDK/JVM. Java developers are not going to see any impact or code change because of it.
  5. Is there any use in going to a 64-bit JVM?
    • Yes, of course. The advantage is the enlarged memory space the application can address. The cost paid is performance.

JIT – Just-in-time Compiler
A related topic to this is JIT. I have seen many folks misunderstand JIT because of the “compiler” word attached to it. Compilation in Java creates the hardware- and OS-neutral bytecode, so JIT is misunderstood as another Java-to-bytecode compiler or something along those lines.

What is JIT?
JIT is a feature of the JVM runtime. Once the byte code is generated, the JVM used to interpret each bytecode instruction and execute it; this meant a big latency in execution compared to the machine code generated from C or C++.  To overcome this, JIT was introduced in the Java HotSpot Virtual Machine, with the objective of improving the performance of runtime execution.

How JIT improves code execution
Instead of every bytecode instruction being interpreted, the JIT compiler in the JVM compiles portions of the bytecode into native code and executes that.  Optimization algorithms that are part of the JVM help identify the segments to compile.  Conversion to native instructions happens at the method level.
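A quick way to watch this happen is HotSpot’s -XX:+PrintCompilation flag; running a small hot loop like the sketch below (class and method names are made up) prints methods as they get compiled:

// Run with: java -XX:+PrintCompilation HotLoop
// Methods invoked repeatedly show up in the compilation log as the JIT kicks in
public class HotLoop {
    public static void main(String[] args) {
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            sum += compute(i); // compute() becomes hot and gets compiled to native code
        }
        System.out.println(sum);
    }

    static int compute(int i) {
        return i * 31 + 7;
    }
}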

So, any options within JIT?
There are two different JITs – one for the server and one for the client. In the Sun (now Oracle) HotSpot JVM, the client option is the default; in JRockit, the server option is the default.
Just-in-time compilation is available in .NET as well as Java.

Java.exe -d32 and -d64
Running java -help will show you the -d32 and -d64 options; these tell the JVM whether it should run in 32-bit or 64-bit mode. On Solaris, on a 64-bit JVM, you can invoke either option.  On Windows, if you have only a 32-bit JVM and pass -d64, you will get an error message.

Summary
The byte code generated is OS- and bit-architecture agnostic. Compatibility issues come with the usage of native code (DLLs) within your Java application. JIT is purely to enhance the runtime code execution speed of the JVM.  By default, JIT is enabled in present-day JVMs.

Moving to a 64-bit JVM is done to access a larger memory space, which is attained at the cost of performance.  Also, as the individual storage spaces are bigger, the same application that ran on 32-bit will need a larger heap size on a 64-bit JVM.



Sublime Text 2 – What not to like in it?

Like a lot of developers around the globe, I have been in constant search for the best editor; an editor that is compact, small (in terms of footprint) and one that assists me in writing code – be it JavaScript, CSS or Java for that matter.  I remember starting with Borland’s JBuilder.  Then I tried jEdit, NetBeans and Vim, until Eclipse.  Eclipse was my editor of choice until I touched Node JS code.  The nodeclipse plugin was not good enough, so I bought a WebStorm IDE personal license, which has its own advantages for JS development.

Within my first month of JS development, I had come across Sublime Text many times in various blogs and tutorials.

You can download sublime text from  : http://www.sublimetext.com/

Last week I decided to give it a try.  I did, and I was blown away by the functionality, speed and various other aspects, which I have put out here.  If you are a Mac OS user, the keystrokes will vary; lots of the YouTube tutorials on Sublime Text are Mac OS centric, and you can check them out.  Folks from LevelUp Tuts have a series of tutorials on YouTube on Sublime Text, which I recommend.

http://www.youtube.com/watch?v=k01udDD-UwI

In my view Sublime text is fantastic for the following reasons:

  1. Package Control and Insane list of packages / plugins
  2. Multiline select
  3. Vi keystroke support
  4. Lightning fast
  5. Super light – in terms of footprint
  6. Great linting facility and dot assists (for Java, JS, CSS…)
  7. Font change with Ctrl+
  8. Moving across files
  9. Moving within a file with a ‘Goto a symbol’
  10. Snippets

I am yet to explore the commands available within; that is for another day.  Let’s look at some of the items above in more detail.

Package Control and Insane list of packages / plugins

First and foremost, install Package Control from wbond.  This is like the App Store for iOS, Google Play for Android devices, or the Synaptic package manager for Ubuntu.  It opens up tons of packages that are easy to install.  Go to http://wbond.net/sublime_packages/package_control to install the package manager.

Multiline Edit – Select All & Replace

Seeing ‘multiline select and replace’ work was the ‘aha moment’ of Sublime Text for me.

Assuming you want to change all the ‘div’s in the HTML to ‘p’, do the following:

  1. Keep the cursor in front of a ‘div’ and press Alt+F3.
  2. This selects all the divs (grey highlighting) and multiple cursors will be shown – one at the end of every word.  This means, all the DIVs are ready to be changed, with whatever you type.
  3. Type the new word to replace all the identified ‘div’s.


Multiline Select – Random Select

  1. Instead of replacing all occurrences of a particular word, you can pick and choose the words you want to replace with Ctrl+MouseClick.
  2. Ctrl+MouseClick on the words you want to replace; all of them will be selected.  As you start typing a word, all the selected words are replaced at once. Just fantastic!!!


Moving across files  (equivalent to ‘Open Resource’ – Ctrl + Shift + R in Eclipse )

  1. You can open a folder and work within it. The files and folders opened can be viewed with Ctrl+K, Ctrl+B (toggle the side bar)
  2. Ctrl + P helps you go to any file within the project you are working on.

Fuzzy Search

In Sublime Text all the searches are fuzzy searches.  http://en.wikipedia.org/wiki/Approximate_string_matching will give you details on what fuzzy search means.

Instant Preview

As you press Ctrl+P and start typing a known portion of the file name, fuzzy matching happens. As it matches, the identified file is previewed instantly – in milliseconds.  Makes me want to see the code that does this.

Package Installation

Ctrl+Shift+P opens up the command palette.  From there you can select ‘Install Package’, which opens up the list of available packages.

Themes

You can install themes from ‘Install Package’.  Nexus, an Android-styled theme, is one I like.  It has some useful touches – saved files are underlined blue, whereas yet-to-be-saved files are underlined orange.


Vi support

  1. Go to Preferences → Settings – User
  2. By removing ‘Vintage’ in the ignored packages list, you can enable vi keystrokes. Works like magic.

Moving within a file with ‘Goto …’

Very interesting feature; very useful for lengthier files.

  1. Go to a symbol by typing Ctrl + P, then @search-keyword-here
  2. Go to a word within a file with Ctrl + P, then #search-keyword-here
  3. Go to a line within a file with Ctrl + P, then :line-number-here


Snippets

  1. You can create your own code snippets.  This exists in every editor; in addition, you can state how the tabs should be placed in the generated code.
  2. $0, $1 mark the sequence of cursor placement in the generated code – on subsequent tabs.

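For illustration, a minimal .sublime-snippet file (the trigger, scope and body are made-up examples) showing the $1/$0 placeholders and tab behavior:

<snippet>
  <content><![CDATA[
for (var ${1:i} = 0; $1 < ${2:count}; $1++) {
    $0
}
]]></content>
  <tabTrigger>forl</tabTrigger>
  <scope>source.js</scope>
</snippet>

Typing the trigger (here, forl) and pressing Tab expands the body; Tab then cycles the cursor through $1, $2 and finally $0.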

A fantastic editor – well suited for JS, CSS and, to an extent, Java development.  I say “to an extent” for Java because of the lack of a debugger – which is very good in Eclipse.  It might not be worth doing Java development in it if you need real debugging.

Overall, if you have not tried out Sublime Text, it is definitely (more than) worth trying.


Real time Application Usage via ThreadLocal

Recently I was working on a web-based (Spring, JEE) application. As many times in the past, the requirement was to understand the usage live.  Unlike data warehousing requirements, the point of understanding the usage is to ensure the application behaves in the expected manner and is being utilized in the way expected.  It was primarily happening through log analysis, which is cumbersome.  In addition, the data store (DB) was storing only the completed transactions and not the intermediary (or failed) ones.  Application behavior can be understood only if we see the entire transaction against the user input.

Understand by user thread:

Understanding the usage happens by viewing the application’s responses across various logical points of that application.  To do so, you need to ensure you can tag/collate the application’s decision points or states per request.  Such a collection of data points within a flow needs some unique request-level ID to tie all of them into one record.

Challenge in Hand:

No unique ID

With the need to understand the app usage, if we start inserting code across key phases of the application (like external service invocation points, exception catch blocks etc.), then you will need a unique ID to ensure all the phases for that request are collated into one record or group.  Typically such IDs might not exist in all applications.

User Data (like ID or loginName) is not unique

To handle the above issue (of not having a unique transaction-level ID), it is sometimes suggested that application events be collected (during the course of the application’s progress) against user data like the user ID or the user’s login name.  Such attempts are flawed, as the user might try more than once.  Sometimes this is overcome by storing the data with a timestamp (or via a DB sequence) – but then we are assuming/depending on data persistence in a relational data store.

Not all user interaction are persisted

When a user request comes in, lots of applications store only the successfully completed transactions, or the ones that failed near the endpoint, in the persistent store; this helps to tie the received request to the system’s response.  But there will be many scenarios where the user tried to do a transaction and it failed mid-way through, and that half-way data set is not stored in the persistence store.  Such usage scenarios are only known to the application owner via logs.  Storing all such data (successful, partial returns, error responses) might be too much noise and might not be preferred.

DB is not real-time

Storing in a data store, replicating, and using a reporting system is useful, but definitely not the same as seeing in real time and understanding what is happening.

Third Party Tools

Tools/apps for machine data parsing (e.g. parsing application logs, web/app server logs etc.) exist; one of the popular ones is Splunk.  These are heavyweights which might be overkill for the web application in discussion.  Also, one of my (personal) pain points with Splunk was its inability to tie multiple grep patterns together to form a record.  That ability is a must if your application does not tag/log a unique ID for the request being handled.

Solution: Thread Level Store & Flush to a Stat Service

The solution discussed makes use of a thread-level cache, which records the stats across the execution of a thread and is flushed at the end (before a response is sent to the end user) to a local cache, which can then be displayed/queried via various HTTP interfaces.

Thread Local In Java

ThreadLocal (in Java) is – as the name says – a variable local to the scope of a thread. Typically, private static fields which have to be kept specific to the thread, like a transaction ID, user ID or request ID, are kept here.

Thread Level Cache

A simple class holding a ThreadLocal object, and providing methods to set/add/get variables out of the ThreadLocal, can make up the cache.  See the sample code below.

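A minimal sketch of such a cache (the class and method names are illustrative):

import java.util.HashMap;
import java.util.Map;

public final class ThreadLocalCache {

    // one map of name-value stats per thread (i.e. per request)
    private static final ThreadLocal<Map<String, String>> CACHE =
            new ThreadLocal<Map<String, String>>() {
                @Override
                protected Map<String, String> initialValue() {
                    return new HashMap<String, String>();
                }
            };

    private ThreadLocalCache() { }

    // record a stat for the current request (thread)
    public static void put(String key, String value) {
        CACHE.get().put(key, value);
    }

    public static String get(String key) {
        return CACHE.get().get(key);
    }

    // drain and clear this thread's entries; called before the thread is reused
    public static Map<String, String> flush() {
        Map<String, String> snapshot = new HashMap<String, String>(CACHE.get());
        CACHE.remove();
        return snapshot;
    }
}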

Thread Level Cache – Key Benefit

  • Solves the no-ID scenario
    • As you are within the application**, with the ThreadLocal objects accessible via a wrapper class, you can store any attribute you want and rest assured that all the variables/values you store will be collated and kept together for a particular request (thread).

Once you have stats taken across various phases, they can be stored, or held temporarily in an LRU-type cache, behind an HTTP endpoint (as simple as a servlet).

Sample Stat Service

The code sample below shows a simple StatService that enables storing name-value pairs.  All of these are gathered via the ThreadLocal-based caching.

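A minimal sketch of such a StatService, assuming the ThreadLocalCache above; the bounded LinkedHashMap and the record-key scheme are illustrative choices:

import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public final class StatService {

    private static final int MAX_RECORDS = 1000; // should be configurable

    // chronologically ordered, bounded store of flushed per-request records
    private static final Map<String, Map<String, String>> RECORDS =
            Collections.synchronizedMap(
                new LinkedHashMap<String, Map<String, String>>() {
                    @Override
                    protected boolean removeEldestEntry(Map.Entry<String, Map<String, String>> eldest) {
                        return size() > MAX_RECORDS; // evict the oldest record once full
                    }
                });

    private StatService() { }

    // flush the current thread's cached stats into the shared store
    public static void flush() {
        Map<String, String> stats = ThreadLocalCache.flush();
        if (!stats.isEmpty()) {
            RECORDS.put(Thread.currentThread().getName() + "-" + System.nanoTime(), stats);
        }
    }

    public static Map<String, Map<String, String>> records() {
        return RECORDS;
    }
}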

Filter to flush ThreadLocal to Stat Service

Key to collating data via a ThreadLocal-based cache is that we flush before the thread is reused.  Shown below is a servlet filter which lets the request flow through and, on its way back, flushes the ThreadLocalCache.

Even though there is a ‘StatService.flush()’ before the request leaves the web application, we still check the cache for any remnant data at every entry point (see the guard at the top of the filter below).  This is to handle exceptional scenarios.

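A minimal sketch of such a filter, assuming the StatService sketched above:

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class StatFlushFilter implements Filter {

    @Override
    public void init(FilterConfig config) { }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        // guard: flush any remnant data left behind by an exceptional exit on this (reused) thread
        StatService.flush();
        try {
            chain.doFilter(request, response);
        } finally {
            // flush the stats gathered during this request before the thread is reused
            StatService.flush();
        }
    }

    @Override
    public void destroy() { }
}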

Set Up

Now, let’s see in steps how an existing app can make use of this.

  • Create a cache class that wraps ThreadLocal with simple static methods.
  • Create a class that can hold all the records from the ThreadLocalCache; the max count should be configurable.  The sample above uses a LinkedHashMap to store the records in chronological order.
  • Write a servlet or a REST service that will query the StatService in your JVM and report the data.

Statistics HTTP Service

A sample reporting servlet is shown below. All it does is take the data cached in the StatService and return it in the servlet response in a readable fashion.  This is a very simplistic way of looking into what is happening inside your application in real time, instantaneously.
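A minimal sketch of such a reporting servlet, assuming the StatService sketched above:

import java.io.IOException;
import java.io.PrintWriter;
import java.util.Map;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class StatReportServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        resp.setContentType("text/plain");
        PrintWriter out = resp.getWriter();
        // dump each flushed request record as one readable line
        for (Map.Entry<String, Map<String, String>> record : StatService.records().entrySet()) {
            out.println(record.getKey() + " => " + record.getValue());
        }
    }
}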

AGAIN !! Why all of this

If you have a user-facing application, I bet you will always be surprised to see all the ways it gets used (tried).  This is the same concept behind web analytics, which primarily tracks the user’s interaction with the UI pages/flows.  The above way of observing a web application helps ensure the application/services behave the way we designed and developed them to behave.

NOTE:

** Such ThreadLocal-type caches are applicable and useful in applications which are synchronous – that is, applications which work in one flow and do not spin off new async threads.


Map-Reduce

I have started digging into / reading some seminal papers on distributed systems – like the one discussed here, ‘MapReduce: Simplified Data Processing on Large Clusters’ by Sanjay Ghemawat and Jeffrey Dean.

Google web search relies on an index built using MapReduce. Google started using this map-reduce algorithm around 2003.

What is Map Reduce

Companies like Google, Twitter and Facebook have tons of data to work upon.  How do you process data that is so large (petabytes) within a short period? You simply divide and conquer (in parallel). That is the essence of MapReduce. Divide the data set into pieces (see structured vs. unstructured data) and run your code in parallel across a large cluster of commodity computers (like 2-CPU, 160 GB HDD, 4 GB RAM machines).

Why Map Reduce

Without such a divide-and-conquer type algorithm, we would be processing the data sequentially and in one block, which would take forever (for a petabyte-sized data set).

Pre-requisites and Benefits.

Data has to be partition-able across different systems to begin with.

In addition to the scalability aspect, the map-reduce implementation is provided as a separate library (or within IMDG/NoSQL products), so the developer can focus only on the operation – whatever he/she is trying to do. For example, if the developer’s intent is to grep for a string pattern across web pages to create an index, their code should focus only on searching a given subset of data. The complexity of making that search work across a large cluster of machines holding the data set – parallelization, fault tolerance, locality optimization – is unnecessary for them to handle. The distributed-system complexity is taken up by the map-reduce library.

Graphical Representation [Jeffrey Dean, Sanjay Ghemawat, 2004]

[Figure: MapReduce execution overview, from the paper]

Key things from the above figure:

  • Split
    • Slice of input data that the workers will work upon.
  • Master
    • Top-level program that manages all the map/reduce jobs among workers.  The master utilizes the existing set of workers, assigning map jobs to some and reduce jobs to others.  It also keeps track of job execution, so on failures (hardware or software) it can re-issue the same job to another worker that is free.
    • If the master itself goes down, the client will resubmit the whole map-reduce job to be executed again.
  • Worker(s)
    • Worker application that can take up and execute a map or reduce type job.
  • Intermediary
    • Intermediary output from a map job.
  • Output
    • Output after the reduce job.

All of the above will make sense, once you see the example job below.

Sample Map-reduce Jobs

Some sample operations along with their map/reduce functions are given, for easier understanding.

Sample MR Job 1:

Search for a keyword in a pile of text

The data in which the map-reduce based search (above) has to run will be partitioned and stored in memory (or in files) across various machines.  Code will be written for the map and reduce functions.  The map function returns the lines that match the pattern; its input parameter is the block of data in which it should look for the search pattern.  The output from the map function is written to a file or stored in memory. The reduce function, in this case, collates all the outputs into a single output file or in-memory location.
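A toy, single-process sketch of the map/reduce contract for this job (the class and method names are illustrative); a real library would shard the input and run map() on many workers in parallel:

import java.util.ArrayList;
import java.util.List;

public class GrepMapReduce {

    // map: emit every line in this split that contains the pattern
    static List<String> map(List<String> split, String pattern) {
        List<String> matches = new ArrayList<String>();
        for (String line : split) {
            if (line.contains(pattern)) {
                matches.add(line);
            }
        }
        return matches;
    }

    // reduce: collate the intermediate outputs from all map jobs into one result
    static List<String> reduce(List<List<String>> intermediates) {
        List<String> output = new ArrayList<String>();
        for (List<String> partial : intermediates) {
            output.addAll(partial);
        }
        return output;
    }
}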

Sample MR Job 2:

The role of the reduce function differs based on the operation at hand. If we take an example scenario –

count the number of visits by a user, where the input data set is the logs for the past 10 years

– then the map will look for the user’s visits in the logs/input data supplied, and the reduce will collate and sum up the counts received from the various map jobs.

Map-Reduce In Reality

In-memory grid / NoSQL solutions like WebSphere eXtreme Scale, Oracle Coherence and JBoss Infinispan implement this map-reduce internally and hide all the complexity – fail-over, parallelization, partitioning logic – from the developers.  So nowadays (assuming you have such an IMDG or NoSQL product), you can achieve MapReduce by just writing your logic (for map and reduce) and deploying it.
