A few “PAAS”ing thoughts

November 24, 2010 9 comments

Google App Engine has suddenly been thrown into the spotlight by a critical piece that Carlos Ble wrote on his blog.

I got my share of mixed feelings while reading his post. Why, you may ask? I have been spending the better part of my free time digging into Google App Engine, a PaaS platform that allows you to write web applications in either Java or Python. Heck, I have even combined all my findings and published them as a free eBook over here.

Should I get all defensive about Google App Engine and try to blast each of Carlos' points? Actually, no. I do not work for Google, nor does my company currently do any work on Google App Engine. All my investigations into Google App Engine have been purely out of my own interest and my goal of discovering a good platform to help showcase some of my applications.

What Carlos has written in his blog post should not be construed as a grab for more publicity, nor as proof that Google App Engine is completely useless. For a platform or product to be successful, it needs its healthy dose of people who love it, hate it, just about like it, just about manage to work with it, and many more feelings in between. Google is not a stupid company, its engineers are smart, and one can safely conclude that they know much more about the shortcomings of their App Engine platform than Carlos or its users, like you and me.

If you have the time to go through Carlos' post and the colorful comments that follow, one thing stands out in particular: the concept of Cloud Computing is not being bashed in totality. This model of computing is real and no one is debating that. Given that, it is important to look at the three flavors of Cloud Computing: IaaS, PaaS and SaaS. We will keep SaaS aside for the moment. IaaS and PaaS are clearly defined. If you need computing resources (storage, CPU, etc.) and complete control to run whatever you want on your own application stack, go with IaaS; Amazon does a fantastic job there. And to anyone dealing with PaaS, it is clear that all vendors in that space mandate a programming model. There are no ifs and buts about it.

Let us dig into that a little. By mandating a programming model, the PaaS vendors are clearly telling you that your application needs to be written to their software stack and the APIs it makes available to you. You cannot just take your in-house web application, deploy it on Google App Engine and expect that it will work. No. Never. Unless it is a servlet doing "Hello World". Every PaaS vendor addresses infrastructure by wrapping APIs around it. They give you APIs for authentication, data storage, caching, messaging, networking and much more. What does that mean? It means you have to retrofit your application to use those APIs to get maximum mileage out of the PaaS platform, because those APIs are closely tied into how the PaaS provides you elasticity.

Take database support. Since the beginning, Google App Engine has not supported SQL database storage (I am not talking about its enterprise edition that has MySQL support). This means that your data storage layer has to be written (or rewritten) to fit the NoSQL-like service that the Google Datastore provides. Does it have restrictions? Is there a paradigm shift in thinking needed to move to it? The simple answer is "Yes". But which software in this world does not have restrictions? If it were that simple, the world would have needed only one programming language. Anyway, that's not the point. The point is that you have to adjust as you move to a PaaS platform. PaaS platforms mandate a programming model and you need to follow it. As you go through that, you need to modify your thinking first and then your code, constantly measure what is working and what is not, and keep adapting accordingly. All companies do that with their products, especially public-facing live applications, and this is no different.

Will all applications fit within Google's PaaS? You don't need me to tell you that they will not. What is it in particular that I like about Google App Engine? For me, its biggest selling point has been the low cost of entry (don't mistake that for a zero cost of entry!). I have developed several proof-of-concept applications that demonstrate some idea in my head and hosted them all on Google App Engine. Some of those applications are still live and running today. I have a good feel for what works and what doesn't, and have adapted my applications. I find it works best for applications that you have the luxury of writing from scratch. I would be careful about simply assuming that my enterprise apps can be hosted as-is on its platform tomorrow, for the reasons described above.

I have talked to a lot of folks about Google App Engine. And I intentionally mention things like the 30-second limit for requests and the 1000-record query limit. When you mention these to any software engineer, the first reaction is akin to an electric shock, and then the outpouring starts. Without even thinking about what application they are going to write, they declare that this platform is not for them. Here are some of the remarks I have heard in response to the above limitations: How could Google do something like this? A 30-second limit for requests? Do you know my application will need much more than that? What, no SQL? What are you talking about? Have you lost your mind? Is there even such a thing? What happened to your education and those C.J. Date RDBMS primers? :-)

All the above concerns are valid, but my only response is that a PaaS is a give-and-take relationship. You give up full control over how you write things. In return you get scalability for a fraction of the cost, and your IT infrastructure is managed without you even having to understand it. Once you appreciate this symbiotic relationship, which is true in every other aspect of our lives too, things settle down and it does not look so "cloudy" anymore (no pun intended!).

One more point. There are enough success stories for both Amazon (IaaS) and Google (PaaS). Ask any of those companies about their journey and there will be stories of pain, adjustments and finally making it work. There will now be companies posting their success stories with Google App Engine too, to mitigate the bad press. But I personally don't worry about that. What I am excited about is that this fire started by Carlos will result in more information coming out about Google App Engine, including stories from the trenches. I will be keeping my eyes open for the resources that spring up and learning more about the platform. It would be a big advantage to Google App Engine now if they embrace the open nature of the comments and actually take up, on a war footing, some of the limitations that even they know about!

Finally, I would like to conclude that there is no point in telling people that they need to read the documentation and understand a platform completely before committing their resources to a particular product or platform. This model does not scale anymore. It does not get you to market faster anymore. And it goes very much against the fundamental notion of gaining the early-mover advantage. Think about how many articles you have come across in the last 1-2 years, even from giants like Twitter and Facebook, publicly stating that technology X or Y did not work or scale for them, and that they hence had to develop that infrastructure themselves or replace it with something better suited to them. Are we expecting a Google or an Amazon to tell you, "Hey! You know what? Our platform really sucks if you want to do A, B, C."? They can give some general guidelines and best practices, but everything else has to be discovered and shared, and that is all Carlos is trying to say here. There will be only one winner in the end: the PaaS platform itself.

Long Live Cloud Computing! Long Live IaaS and PaaS!

Categories: Uncategorized

Architectural Considerations to move your product to GAE

July 29, 2010 Leave a comment

I wrote a blog post at my company’s site titled “Architecture considerations for software products on cloud”. Though the article is general in nature, I do feel that a lot of the points apply if you are planning to move your product to GAE.

Click here for the blog post.

Categories: Uncategorized

Project Management Principles via FIFA World Cup 2010

July 5, 2010 2 comments

This is slightly off topic and is present on my other blog. I present my summary of good Project Management vis-a-vis the FIFA World Cup 2010.

Link to Post :

Categories: Uncategorized

Public API Design Factors

If you have developed a successful application/service on Google App Engine, chances are that you are thinking of exposing your application/data for use by the public. That is excellent news, since it is likely to allow mashup developers to integrate your application into an overall larger scheme of things. This blog post goes into the factors you need to address before you begin your journey to create a public API.

The web has moved from its image of hosting web applications to that of a platform. A platform is, primarily, a set of building blocks that you can combine in innovative ways to build your own application. One of the key factors driving the web towards this "platform" avatar has been the emergence of a large number of public APIs based on open standards. A public API is, in simple terms, the exposure of key pieces of your application for consumption by other applications.

A public API is an important piece of any web application today. Over the last 5 years, public APIs have proliferated, and everyone from the big companies to the newest startup bets big on a public API. According to the premier API tracking site, The Programmable Web (http://www.programmableweb.com), there are currently 2033 APIs available to applications [dated 21st June 2010].

What can we do with public APIs? Here are some possibilities:

  1. You can integrate best-of-class functionality that is already exposed by a public API. For example, if you need maps, simply integrate the Google Maps API. You do not need to reinvent the wheel.
  2. You can provide a completely different user experience (interface) to an existing service.
  3. You can utilize the public API in a larger mashup that you are creating.
  4. You can create applications running on devices that the current provider of the public API does not address. For example, you could write mobile applications running on the iPhone, Android, etc. that the public API provider does not itself provide.

Now that we have discussed what one can do with a public API, here are some of the potential benefits that a public API brings to your existing application.

A public API gives a huge number of programmers/applications access to your platform.

  1. Using open standards, they are able to build applications with your API in ways that even its creators could not have envisioned. And since public APIs are usually built on open standards, they get integrated into a wide range of applications running on hardware ranging from desktops to mobiles.
  2. In the current market, programmers do not want to build something that an available public API already exposes, and if the API is backed by a scalable and highly available environment, they will adopt it.

This article now builds on the premise that you are convinced a public API will help your existing service. It discusses the important points that you must consider before you embark on coding the public API. Each point focuses on a theme rather than implementation details, and they are listed in no particular order of importance.

REST v/s Web Services (SOAP)

These are the two predominant models for invoking an API: REST and Web Services (SOAP). REST is preferred by most applications today, primarily because of its simplicity and its natural mapping of resource operations onto HTTP verbs like GET, PUT and POST. It is fair to say at this point that you should expose your services the REST way, even if you already have a Web Services (SOAP) mechanism for exposing them.
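To make the verb-to-operation mapping concrete, here is a minimal, hypothetical sketch. The "reports" resource, routes and operation names are illustrative only, not taken from any real API:

```java
// Hypothetical sketch: how REST-style requests map HTTP verbs onto
// logical operations for a "reports" resource.
public class RestRouter {

    // Map an incoming verb + path to a logical operation name.
    public static String route(String verb, String path) {
        boolean hasId = path.matches("/reports/\\d+");
        if (verb.equals("GET") && path.equals("/reports"))  return "LIST_REPORTS";
        if (verb.equals("GET") && hasId)                    return "GET_REPORT";
        if (verb.equals("POST") && path.equals("/reports")) return "CREATE_REPORT";
        if (verb.equals("PUT") && hasId)                    return "UPDATE_REPORT";
        if (verb.equals("DELETE") && hasId)                 return "DELETE_REPORT";
        return "NOT_FOUND";
    }

    public static void main(String[] args) {
        System.out.println(route("GET", "/reports/42")); // GET_REPORT
        System.out.println(route("POST", "/reports"));   // CREATE_REPORT
    }
}
```

Notice that the same URL means different things under different verbs; that implicit contract is what makes REST endpoints simple to discover and document.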

Response Data Format (XML, JSON)

Every API call results in data being returned to the calling application. The data could be a result containing data records or, in the case of update APIs, simply a response indicating success or failure. Two formats are predominant here: XML and JSON. Your API should be flexible enough to allow the caller to specify in which format they would like the response sent back. Several APIs of late have adopted JSON as their only supported data format. The best option at this point in time is to support both if possible.
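As a hedged illustration of letting the caller pick the format, here is a sketch that switches on a hypothetical "format" query parameter. A real API should use a proper XML/JSON serialization library rather than hand-built strings, and the field names here are made up:

```java
public class ResponseFormatter {

    // Render a simple status response as JSON or XML depending on the
    // caller's requested format (e.g. from a "format" query parameter).
    public static String render(String format, String status, String message) {
        if ("json".equalsIgnoreCase(format)) {
            return "{\"status\": \"" + status + "\", \"message\": \"" + message + "\"}";
        }
        // Default to XML when no (or an unknown) format is requested.
        return "<response><status>" + status + "</status><message>"
                + message + "</message></response>";
    }

    public static void main(String[] args) {
        System.out.println(render("json", "success", "Record saved"));
        System.out.println(render("xml", "success", "Record saved"));
    }
}
```

Supporting both formats through one code path like this keeps the dual-format promise cheap to honor.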

Service Contract Description

A Service Contract of your API is extremely important. It clearly defines what features your public API provides. This includes:

  1. The different Services and their names
  2. How they are accessible
  3. Authentication mechanisms
  4. Method signature for each method in a Service
  5. Description of each parameter in a method

It is important to take small steps in the initial release of your API. Do not expose all functionality at once. A preferred approach is to first expose a read-only API that gives users access to your data, and then introduce write methods that allow users to save their information. Introduce newer methods and functionality in later versions, but release a fairly compact Service Contract in your initial versions. It will help you identify the chinks early enough, and help you refactor the existing Service Contract and introduce newer ones.

API Authentication

One cannot stress enough how important it is to make your public API secure. You only want authenticated users to access your public API, and various strategies and authentication/authorization standards are emerging for you to build on. The most common approach is to issue an API Key to any application that wants to use your public API. This API Key is provided on signing up and must accompany every request the calling application makes to your API. You can then authenticate the key and also determine whether the call is within the acceptable Rate Limits (covered in a later point) that you have set per API Key.
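A minimal sketch of the API Key idea might look like the following. The class and method names are hypothetical, and a real service would persist issued keys in its datastore rather than keep them in one server's memory:

```java
import java.util.HashMap;
import java.util.Map;

public class ApiKeyAuthenticator {

    // In-memory key store for illustration only; a real service would
    // keep issued keys in its datastore, indexed by application.
    private final Map<String, String> issuedKeys = new HashMap<>();

    // Called when an application signs up for your public API.
    public void issueKey(String appName, String apiKey) {
        issuedKeys.put(apiKey, appName);
    }

    // Every incoming request must carry a valid, previously issued key.
    public boolean isAuthorized(String apiKey) {
        return apiKey != null && issuedKeys.containsKey(apiKey);
    }

    public static void main(String[] args) {
        ApiKeyAuthenticator auth = new ApiKeyAuthenticator();
        auth.issueKey("MyMashup", "abc123");
        System.out.println(auth.isAuthorized("abc123")); // true
        System.out.println(auth.isAuthorized("bogus"));  // false
    }
}
```

Because the key also identifies the calling application, the same lookup is the natural place to hang per-key rate-limit checks.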

An authentication standard, OAuth, is gaining widespread acceptance as the preferred way to let users first validate themselves with the service provider, which then passes a token to the calling application confirming that they have been validated. This removes the need for the calling application to store your users' usernames/passwords. A large number of popular public APIs, like Twitter's, have adopted this standard and you should have it on your radar.

Service Versioning

As your service evolves, the public API could also change to match the new features. You need to make sure that you do not break existing clients who have already bound to your existing public API. Enter versioning! Think about versioning your public API right from the first release. If you are using a REST-like mechanism, then inserting a version number like 1 or v1 into the URL might be a good start. With each version, specify clearly what is different, what the additional features are, etc.
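For instance, here is a sketch of pulling the version out of a REST-style URL. The /v1/... path layout is an assumption for illustration, not a standard:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ApiVersionParser {

    private static final Pattern VERSION = Pattern.compile("^/v(\\d+)/");

    // Extract the version from a versioned REST path such as /v1/reports
    // or /v2/reports/42; returns -1 when no version segment is present.
    public static int parseVersion(String path) {
        Matcher m = VERSION.matcher(path);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    public static void main(String[] args) {
        System.out.println(parseVersion("/v1/reports"));    // 1
        System.out.println(parseVersion("/v2/reports/42")); // 2
        System.out.println(parseVersion("/reports"));       // -1
    }
}
```

Once the version is a first-class part of the request, old clients can keep hitting /v1/ endpoints while new features ship under /v2/.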

Rate Limits

All public APIs should be free to use in the initial phase to increase their adoption. Unless you strongly feel that you need a paid public API right from day 1, you should definitely make the public API free to use. However, it is important for it to make business sense at the end of the day; after all, computing resources do cost money. An approach used by almost all companies is to build Rate Limits (or Quota Limits) into their public API. Some examples include:

  1. 10,000 calls per day
  2. 1000 calls per hour
  3. 1 GB of free total disk space (this applies in scenarios where you also persist objects in your cloud as part of the public API)

Rate limits are typically reset at regular intervals, such as hourly or daily. Where you also provide storage space, the disk space quota is fixed at an upper limit and is typically not reset.

No matter what your functionality is, defining the rate limits is important because you will need to build the corresponding checks into your public API infrastructure.

At the same time, there is a clear possibility that if your application, and consequently your public API, is hugely successful, then there will be applications that easily go past your rate limits. That is a good problem to have. In those scenarios (and you should expect them), you need to consider one of two approaches:

  1. Selectively increase the rate limits for a particular application. You will first need to discuss the requirements with the original creators of the application.
  2. Provide a paid option, where the application buys the increased limit either in tiers that you have set up or on a pay-as-you-go basis. Pay-as-you-go is generally preferred because you may not get the high spikes every single day.
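The "N calls per interval" idea above can be sketched roughly as a fixed-window counter. The class is hypothetical, and a production limiter would keep its counters in a shared store (memcache, datastore) rather than one server's memory:

```java
import java.util.HashMap;
import java.util.Map;

public class RateLimiter {

    private final int limit;          // e.g. 1000 calls per window
    private final long windowMillis;  // e.g. one hour
    private final Map<String, Integer> counts = new HashMap<>();
    private final Map<String, Long> windowStart = new HashMap<>();

    public RateLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    // Returns true if the call for this API key is within quota,
    // resetting the per-key counter once the window has elapsed.
    public synchronized boolean allow(String apiKey, long nowMillis) {
        Long start = windowStart.get(apiKey);
        if (start == null || nowMillis - start >= windowMillis) {
            windowStart.put(apiKey, nowMillis);
            counts.put(apiKey, 0);
        }
        int used = counts.get(apiKey);
        if (used >= limit) {
            return false; // over quota: reject (or bill, per your policy)
        }
        counts.put(apiKey, used + 1);
        return true;
    }

    public static void main(String[] args) {
        RateLimiter limiter = new RateLimiter(2, 3_600_000L); // 2 calls/hour
        System.out.println(limiter.allow("abc123", 0L));          // true
        System.out.println(limiter.allow("abc123", 1_000L));      // true
        System.out.println(limiter.allow("abc123", 2_000L));      // false
        System.out.println(limiter.allow("abc123", 3_600_000L));  // true, new window
    }
}
```

Raising a particular application's quota, or selling tiers, then becomes a matter of configuring a different limit per API key.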

Documentation

It is important to support your public API with strong documentation. Pay attention to every aspect of the documentation. At a minimum, you must clearly document the following:

  1. End points for the different architectural styles (REST, WS), along with versioning
  2. All services, methods (actions) and their parameters
  3. Sample request/response data formats for each method (action) in your service
  4. Error scenarios, along with server-side status codes

The best way to get started with documentation is to look at the documentation of existing public APIs. Pick one that suits you and stick to it. Several public APIs even provide an RSS feed for their API documentation to announce any changes.

Helper Libraries

It is important to wrap your REST/Web Service calls in easy-to-use helper libraries. Helper libraries target a number of client programming languages/platforms like Java, Python, Ruby, JavaScript, .NET, etc. A helper library gives someone a quick starting point for exercising your public API within minutes or hours. A helper library should (ideally) address the following:

  1. Wrap the security calls (authorization)
  2. Wrap the REST/WS calls and the data format (XML, JSON) parsing
  3. Optionally, return the results as classes defined in the target high-level language, e.g. Java classes
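A tiny, hypothetical Java helper illustrating points 1 and 2: it envelops the API key and builds the request URL so the caller never assembles raw parameters. The class name and endpoint layout are made up for this sketch:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class ReportApiClient {

    private final String baseUrl;
    private final String apiKey;

    public ReportApiClient(String baseUrl, String apiKey) {
        this.baseUrl = baseUrl;
        this.apiKey = apiKey;
    }

    // Build the full request URL, enveloping the API key so that the
    // caller never has to remember to append it.
    public String buildRequestUrl(String method, Map<String, String> params) {
        StringBuilder url = new StringBuilder(baseUrl)
                .append('/').append(method)
                .append("?apikey=")
                .append(URLEncoder.encode(apiKey, StandardCharsets.UTF_8));
        for (Map.Entry<String, String> e : params.entrySet()) {
            url.append('&').append(e.getKey()).append('=')
               .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return url.toString();
    }

    public static void main(String[] args) {
        ReportApiClient client =
                new ReportApiClient("http://api.example.com/v1", "abc123");
        Map<String, String> params = new LinkedHashMap<>();
        params.put("pincode", "400101");
        System.out.println(client.buildRequestUrl("reports", params));
    }
}
```

A full helper library would layer the actual HTTP call and the XML/JSON parsing on top of this, returning typed result classes per point 3.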

Dedicated Public API Web page/site

It is strongly recommended to have a companion web site for your Public API. This companion website can serve as a one stop destination for all documentation related to your public API. It can also provide information like:

  1. Signing up for the API and getting issued an API Key
  2. Showcase applications / case studies of how people have used your public API
  3. List official Helper libraries and contributed Helper libraries that make it easier to use your API.
  4. An API forum where users discuss your API and which your API support engineers can regularly monitor
  5. RSS Feeds that allow people to subscribe to API updates. Providing a “Sign up” for email updates to notify users is also recommended.

I hope this article gives you a broad framework of the high-level factors to consider for your public API release. Design and development can then commence much more smoothly. There are many other small details to consider, so please share your additional points in the comments for the benefit of all.

References

  1. ProgrammableWeb (http://www.programmableweb.com)
  2. Open APIs: State of the Market, May 2010. John Musser (http://www.slideshare.net/jmusser/pw-glue-conmay2010)
  3. OAuth Protocol Site : http://oauth.net/
Categories: Uncategorized

Google Wave Programming Articles Update

May 31, 2010 1 comment

There were 2 articles posted on the Google App Engine experiments blog.

1. Episode 7: Writing your First Google Wave Robot (https://gaejexperiments.wordpress.com/2009/11/04/episode-7-writing-your-first-google-wave-robot/)

2. Episode 11: Develop Simple Google Wave Robots using the WadRobotFramework (https://gaejexperiments.wordpress.com/2009/12/03/episode-11-develop-simple-google-wave-robots-using-the-wadrobotframework/)

I have updated both of these articles. The reasons for updating them are:

1. Google released a newer version of the Robots API: version 2.0. They recommend that all robots written with the earlier API (which the articles covered) be ported to the new version 2.0 API. The older API will be deprecated by the end of June.

2. WadRobotFramework (http://code.google.com/p/wadrobotframework) has also been updated to use the newer version 2.0 of the Robot API internally. Along with that, WadRobotFramework has introduced several new features, including a generator (so you end up writing as few lines of code as possible), the concept of obedience, etc.

You will find both these articles updated at my new Blog (Google Wave Experiments) that is focused on Google Wave programming. Please find the articles over here:

Episode 1 : Writing your First Google Wave Robot using Robot API v2

Episode 2 : Writing a Wave Robot using WadRobotFramework

Hope you like the articles. I look forward to your feedback and would like to know what you want to see covered vis-a-vis Google Wave programming, so that I can conduct my experiments accordingly.

Categories: Uncategorized

Episode 16 : Using the Datastore API

March 17, 2010 8 comments

Welcome to Episode 16. In this episode we shall cover basic usage of the Datastore API in Google App Engine. The Datastore API is used to persist and retrieve data. Google App Engine uses BigTable as its underlying datastore and provides an abstraction over it via the Datastore API. There are currently two options available to developers: the Java Data Objects (JDO) and Java Persistence API (JPA) interfaces.

In this episode, we shall cover the following items:

  • Persist a simple record to the datastore using JDO. The intention is to limit it to a single record in this episode and not address relationships and complex structures. That could be the basis of future episodes.
  • Retrieve the persisted records by firing some queries. In the process, we shall see how to create parameterized queries and execute them.
  • Discuss some nuances about indexes and what steps you need to take to make sure that the same application that works locally will work fine when deployed to the Google App Engine cloud.

The underlying theme is not to present a comprehensive tutorial on the Datastore API; there are excellent references available for that, such as the official documentation as well as the GAE Persistence blog. The focus is on getting up and running with the Datastore API ASAP and seeing it work when deployed to the cloud.

What we shall build

In this episode, we shall build the following:

  1. Create a simple object that we shall persist to the underlying datastore. The object will be a Health Report and will have five attributes that we would like to save.
  2. Write the Save method that takes an instance of the above Health Report Record and persists it using the JDO API.
  3. Write a Search method that will query for several Health Reports using several filter parameters.
  4. Look at the datastore-indexes.xml file that is required when you deploy the application to the cloud.

Please note that the focus will be on the server side and not on building a pretty GUI. All server-side actions will be invoked via a REST-like request (HTTP GET), so that we can test the functionality in the browser itself.

Developing our Application

The first thing to do is to create a New Google Web Application Project. Follow these steps:

1. Either click on File –> New –> Other or press Ctrl-N to create a new project. Select Google and then Web Application project. Alternately you could also click on the New Web Application Project Toolbar icon as part of the Google Eclipse plugin.
2. In the New Web Application Project dialog, deselect the Use Google Web Toolkit and give a name to your project. I have named mine GAEJExperiments. I suggest you go with the same name so that things are consistent with the rest of the article, but I leave that to you. In case you are following the series, you could simply use the same project and skip all these steps altogether. You can go straight to the Servlet Development section.
3. Click on Finish.

This will generate the project and also create a sample Hello World Servlet for you. But we will be writing our own Servlet.

Few things to note first:

Quite a few things are enabled for you by default as far as the database support is concerned. They are as follows:

a. Several JAR files are added to the CLASSPATH by default. Take a look and you will see several JARs: *jpa*.jar, *datanucleus*.jar, etc.
b. In the src/META-INF folder, you will find a jdoconfig.xml file. It defines a default PersistenceManagerFactory that we shall use in the rest of the article. For the purposes of this article we do not need to change this file.
c. GAEJ uses the DataNucleus library to abstract the BigTable store. The DataNucleus library provides the JDO and JPA interfaces so that you do not have to deal with the underlying low-level API. You will also find a logging.properties file in the war/WEB-INF folder, with log levels specified for the DataNucleus classes. You can tweak them to finer levels like DEBUG/INFO to see more detailed statements of what happens when you are using these APIs. I have found setting the levels to DEBUG/INFO very helpful, especially when facing a problem.

PMF.java

The first class that we shall write is a simple utility class that gets us the underlying PersistenceManagerFactory instance. This class is important since all other operations, like saving a record or querying records, work off an instance of the PersistenceManagerFactory.

The code is shown below; wherever we need an instance, we shall simply invoke its get() method:


package com.gaejexperiments.db;

import javax.jdo.JDOHelper;
import javax.jdo.PersistenceManagerFactory;

public final class PMF {
    private static final PersistenceManagerFactory pmfInstance =
        JDOHelper.getPersistenceManagerFactory("transactions-optional");

    private PMF() {}

    public static PersistenceManagerFactory get() {
        return pmfInstance;
    }
}

HealthReport.java

Next is the Health Report. As mentioned at the beginning of the article, we shall be saving a Health Report record. The Health Report has five attributes, explained below:

  1. Key : This is a unique key that is used to persist/identify the record in the datastore. We shall leave its implementation/generation to the Google App Engine implementation.
  2. PinCode : This is similar to a ZIP code. This is a string that shall contain the value of the pin code (area) where the health incident has occurred.
  3. HealthIncident : This is a string value that contains the health incident name, for example Flu, Cough or Cold. In this simple application we shall be concerned with only three health incidents: Flu, Cough and Cold.
  4. Status : This is a string value that specifies if the record is ACTIVE or INACTIVE. Only records with ACTIVE status shall be used in determining any statistics / data reports. We shall set this value to ACTIVE at the time of saving the record.
  5. ReportDateTime : This is a Date field that shall contain the date/time that the record was created.

Shown below is the listing for the HealthReport.java class. In addition to the above attributes and getter/setter methods for them, note the following additions to make sure that your class can be persisted using JDO.

1. We have a constructor that takes all the fields except the Key field.
2. All fields that need to be persisted are annotated with the @Persistent annotation.
3. The class is declared persistable via the @PersistenceCapable annotation, with the identity type set to Application.
4. The primary key field, key, is declared via the @PrimaryKey annotation, and we use an available ID generator instead of rolling our own.

package com.gaejexperiments.db;

import java.util.Date;
import com.google.appengine.api.datastore.Key;

import javax.jdo.annotations.IdGeneratorStrategy;
import javax.jdo.annotations.IdentityType;
import javax.jdo.annotations.PersistenceCapable;
import javax.jdo.annotations.Persistent;
import javax.jdo.annotations.PrimaryKey;

@PersistenceCapable(identityType = IdentityType.APPLICATION)
public class HealthReport {
    @PrimaryKey
    @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
    private Key key;

    @Persistent
    private String pinCode;
    @Persistent
    private String healthIncident;
    @Persistent
    private String status;
    @Persistent
    private Date reportDateTime;

    public HealthReport(String pinCode, String healthIncident, String status, Date reportDateTime) {
        super();
        this.pinCode = pinCode;
        this.healthIncident = healthIncident;
        this.status = status;
        this.reportDateTime = reportDateTime;
    }

    public Key getKey() {
        return key;
    }

    public void setKey(Key key) {
        this.key = key;
    }

    public String getPinCode() {
        return pinCode;
    }

    public void setPinCode(String pinCode) {
        this.pinCode = pinCode;
    }

    public String getHealthIncident() {
        return healthIncident;
    }

    public void setHealthIncident(String healthIncident) {
        this.healthIncident = healthIncident;
    }

    public String getStatus() {
        return status;
    }

    public void setStatus(String status) {
        this.status = status;
    }

    public Date getReportDateTime() {
        return reportDateTime;
    }

    public void setReportDateTime(Date reportDateTime) {
        this.reportDateTime = reportDateTime;
    }

}

PostHealthIncidentServlet.java

We shall now look at how to persist the above Health Report. Since we are not going to build a UI for it, we shall simply invoke a servlet (HTTP GET) with the required parameters; it is almost like a form submitting these values to the servlet. Before we write the servlet code, let us look at how we will invoke it. Given below is a screenshot of the browser where I punch in the URL: http://localhost:8888/posthealthincident?healthincident=Flu&pincode=400101


As you can see, I am running the application on my local development server and invoking the servlet (which we shall see in a while), providing two parameters: healthincident and pincode. These map to two key fields of the HealthReport class that we saw above. The other fields, like ReportDateTime and Status, are determined automatically by the application. Similarly, the Key value of the record in the underlying datastore will be generated by the App Engine infrastructure itself.

Let us now look at the PostHealthIncidentServlet.java code shown below:


package com.gaejexperiments.db;

import java.io.IOException;
import java.util.Date;
import java.util.logging.Logger;

import javax.servlet.ServletException;
import javax.servlet.http.*;
@SuppressWarnings("serial")
public class PostHealthIncidentServlet extends HttpServlet {
    public static final Logger _logger = Logger.getLogger(PostHealthIncidentServlet.class.getName());

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        doPost(req, resp);
    }

    public void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        resp.setContentType("text/plain");
        String strResponse = "";
        String strHealthIncident = "";
        String strPinCode = "";
        try {
            // DO ALL YOUR REQUIRED VALIDATIONS HERE AND THROW AN EXCEPTION IF NEEDED
            strHealthIncident = req.getParameter("healthincident");
            strPinCode = req.getParameter("pincode");

            String strRecordStatus = "ACTIVE";

            Date dt = new Date();
            HealthReport HR = new HealthReport(strPinCode,
                                               strHealthIncident,
                                               strRecordStatus,
                                               dt);
            DBUtils.saveHealthReport(HR);
            strResponse = "Your Health Incident has been reported successfully.";
        }
        catch (Exception ex) {
            _logger.severe("Error in saving Health Record : " + strHealthIncident + "," + strPinCode + " : " + ex.getMessage());
            strResponse = "Error in saving Health Record via web. Reason : " + ex.getMessage();
        }
        resp.getWriter().println(strResponse);
    }
}

The main points of the code are:

1. We extract the healthincident and pincode request parameters. We do not perform any particular validations here, but you could add whatever your application requires.
2. We generate the two other field values, i.e. the report date (the current Date) and the status (ACTIVE).
3. Finally, we create a new instance of the HealthReport class, providing the values in the constructor, and then call the DBUtils.saveHealthReport(...) method, which persists the record to the underlying datastore.
4. We display a success message if all is well, which is what was visible in the screenshot above.
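As point 1 notes, validation is left to the application. Purely as an illustrative sketch, a minimal check might look like the following; the ParamValidator class name and the rules (a non-empty incident name, a 6-digit Indian pincode) are my own assumptions, not part of the article's code:

```java
// Illustrative parameter validation helper (hypothetical; names and rules
// are assumptions, not part of the original servlet code).
public class ParamValidator {

    // Throws IllegalArgumentException if either parameter is missing or malformed.
    public static void validate(String healthIncident, String pinCode) {
        if (healthIncident == null || healthIncident.trim().isEmpty()) {
            throw new IllegalArgumentException("healthincident parameter is required");
        }
        if (pinCode == null || !pinCode.matches("\\d{6}")) {
            throw new IllegalArgumentException("pincode must be a 6-digit code");
        }
    }

    public static void main(String[] args) {
        validate("Flu", "400101"); // passes silently
        try {
            validate("Flu", "ABC");
        } catch (IllegalArgumentException ex) {
            System.out.println("rejected: " + ex.getMessage());
        }
    }
}
```

The servlet's doPost could call such a helper inside the try block, so that a bad request falls through to the existing catch clause and error response.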

Let us look at the DBUtils.java class now. Please note that we have simply separated out the code into this file, but you can organize/partition your code in whichever way you like.

DBUtils.java

The DBUtils.java source code is listed below. Focus first on the saveHealthReport() method, which was invoked by our servlet earlier. We shall come to the other method later on in the article.

Key points are:

1. The saveHealthReport() method first gets an instance of the PersistenceManager through the PMF.java class that we wrote earlier.
2. It then invokes the makePersistent() method, passing in the object to be persisted, in our case the HealthReport instance that we created in the servlet. This method persists the record and, in the process, also assigns it a unique key.
3. Finally, we close the PersistenceManager instance by invoking its close() method.

The entire code listing is shown below:

package com.gaejexperiments.db;

import java.util.Calendar;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

import javax.jdo.PersistenceManager;
import javax.jdo.Query;

public class DBUtils {

    public static final Logger _logger = Logger.getLogger(DBUtils.class.getName());

    //Currently we are hardcoding this list, but it could also be retrieved from
    //the database
    public static String getHealthIncidentMasterList() throws Exception {
        return "Flu,Cough,Cold";
    }

    /**
     * This method persists a record to the database.
     */
    public static void saveHealthReport(HealthReport healthReport) throws Exception {
        PersistenceManager pm = PMF.get().getPersistenceManager();
        try {
            pm.makePersistent(healthReport);
            _logger.log(Level.INFO, "Health Report has been saved");
        } catch (Exception ex) {
            _logger.log(Level.SEVERE,
                    "Could not save the Health Report. Reason : " + ex.getMessage());
            throw ex;
        } finally {
            pm.close();
        }
    }

    /**
     * This method gets the count of all health incidents in an area (Pincode/Zipcode)
     * for the current month.
     * @param healthIncident
     * @param pinCode
     * @return A Map containing each health incident name and the number of cases
     *         reported for it in the current month
     */
    public static Map<String, Integer> getHealthIncidentCountForCurrentMonth(String healthIncident, String pinCode) {
        Map<String, Integer> _healthReport = new HashMap<String, Integer>();

        PersistenceManager pm = null;

        //Get the current month and year
        Calendar c = Calendar.getInstance();
        int currentMonth = c.get(Calendar.MONTH);
        int currentYear = c.get(Calendar.YEAR);

        try {
            //Determine if we need to generate data for only one health incident or ALL
            String[] healthIncidents;
            if (healthIncident.equalsIgnoreCase("ALL")) {
                String strHealthIncidents = getHealthIncidentMasterList();
                healthIncidents = strHealthIncidents.split(",");
            } else {
                healthIncidents = new String[]{healthIncident};
            }

            pm = PMF.get().getPersistenceManager();
            Query query;

            //If Pincode (Zipcode) is ALL, we retrieve records irrespective of Pincode
            if (pinCode.equalsIgnoreCase("ALL")) {
                //Form the query
                query = pm.newQuery(HealthReport.class,
                        "healthIncident == paramHealthIncident && reportDateTime >= paramStartDate && reportDateTime < paramEndDate && status == paramStatus");
                //Declare the parameters used above
                query.declareParameters("String paramHealthIncident, java.util.Date paramStartDate, java.util.Date paramEndDate, String paramStatus");
            } else {
                query = pm.newQuery(HealthReport.class,
                        "healthIncident == paramHealthIncident && pinCode == paramPinCode && reportDateTime >= paramStartDate && reportDateTime < paramEndDate && status == paramStatus");
                //Declare the parameters used above
                query.declareParameters("String paramHealthIncident, String paramPinCode, java.util.Date paramStartDate, java.util.Date paramEndDate, String paramStatus");
            }

            //For each health incident (i.e. Cold, Flu, Cough), retrieve the records
            for (int i = 0; i < healthIncidents.length; i++) {
                int healthIncidentCount = 0;

                //Set the From and To dates, i.e. the 1st of this month and the 1st of next month
                Calendar _cal1 = Calendar.getInstance();
                _cal1.set(currentYear, currentMonth, 1);
                Calendar _cal2 = Calendar.getInstance();
                _cal2.set(currentYear, currentMonth + 1, 1);

                List<HealthReport> codes;
                if (pinCode.equalsIgnoreCase("ALL")) {
                    //Execute the query, passing in actual data for the filters
                    codes = (List<HealthReport>) query.executeWithArray(healthIncidents[i], _cal1.getTime(), _cal2.getTime(), "ACTIVE");
                } else {
                    codes = (List<HealthReport>) query.executeWithArray(healthIncidents[i], pinCode, _cal1.getTime(), _cal2.getTime(), "ACTIVE");
                }

                //Iterate through the results and increment the count
                for (Iterator<HealthReport> iterator = codes.iterator(); iterator.hasNext();) {
                    iterator.next();
                    healthIncidentCount++;
                }

                //Put the count in the Map data structure
                _healthReport.put(healthIncidents[i], Integer.valueOf(healthIncidentCount));
            }
            return _healthReport;
        } catch (Exception ex) {
            _logger.log(Level.SEVERE, "Could not retrieve the report. Reason : " + ex.getMessage());
            return null;
        } finally {
            if (pm != null) {
                pm.close();
            }
        }
    }
}

Assuming that your application is running, you can view the data records being persisted. If you navigate to http://localhost:<YourPort>/_ah/admin in your browser, you will see a screen similar to the one shown below:

The screen above shows the Entity types that currently have some data records. In our case it is the HealthReport entity, of which we have saved one record so far. If you click on the List Entities button, you can see the records currently persisted for the HealthReport entity type. A sample screenshot from my system after saving the first record is shown below:

Go ahead and populate a few more records in the database for the different health incidents, i.e. Cough, Cold and Flu (only). This is needed so that we have some data to work with when we cover how to query persistent data in the next section.

ReportsServlet.java

Before we look at this servlet, let us look at the output that it produces, so that the code is easier to follow later. This assumes that you have added at least 4-5 records using the /posthealthincident servlet that we covered above.

Shown below is the screenshot of the servlet output when I provide the following url:

http://localhost:8888/reports?type=HEALTHINCIDENTCOUNT_CURRENT_MONTH&healthincident=Flu&pincode=ALL
What we are asking for here is a report of all health incidents in the current month (type=HEALTHINCIDENTCOUNT_CURRENT_MONTH) where healthincident = Flu and pincode = ALL (i.e. irrespective of pincode).

Shown below is the screenshot of the servlet output when I provide the following url:

http://localhost:8888/reports?type=HEALTHINCIDENTCOUNT_CURRENT_MONTH&healthincident=ALL&pincode=ALL
What we are asking for here is a report of all health incidents in the current month (type=HEALTHINCIDENTCOUNT_CURRENT_MONTH) where healthincident = ALL (i.e. all health incidents) and pincode = ALL (i.e. irrespective of pincode).
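The screenshots are not reproduced here, but based on the ReportsServlet code covered below, the response is an XML document of roughly this shape (the incidents and counts shown are illustrative; the servlet actually emits everything on one line, and the indentation here is only for readability):

```xml
<Response>
  <Status>success</Status>
  <StatusDescription></StatusDescription>
  <Result>
    <HealthIncident><name>Flu</name><count>3</count></HealthIncident>
    <HealthIncident><name>Cough</name><count>2</count></HealthIncident>
  </Result>
</Response>
```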

So what we have effectively done here is query the set of Health Records present in our database using a variety of parameters (filters). In other words, if we take a SQL-like view of it, we are saying something like this:

SELECT * FROM HEALTHREPORTS WHERE PINCODE = %1 AND HEALTHINCIDENT = %2 AND REPORTDATE >= %3 AND REPORTDATE < %4 AND STATUS = 'ACTIVE'

The above SQL statement is just representative of the queries that we want to execute. Let us now look at the code for the servlet.

package com.gaejexperiments.db;

import java.io.IOException;
import java.util.Iterator;
import java.util.Map;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@SuppressWarnings("serial")
public class ReportsServlet extends HttpServlet {

    public void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        resp.setContentType("text/xml");
        String strResult = "";
        String strData = "";
        try {
            String type = req.getParameter("type");
            if (type == null) {
                strResult = "No Report Type specified.";
                throw new Exception(strResult);
            }
            else if (type.equals("HEALTHINCIDENTCOUNT_CURRENT_MONTH")) {
                String strHealthIncident = req.getParameter("healthincident");
                String strPinCode = req.getParameter("pincode");
                Map<String, Integer> _healthReports =
                        DBUtils.getHealthIncidentCountForCurrentMonth(strHealthIncident, strPinCode);
                if (_healthReports != null) {
                    Iterator<String> it = _healthReports.keySet().iterator();
                    while (it.hasNext()) {
                        String healthIncident = it.next();
                        Integer healthIncidentCountObject = _healthReports.get(healthIncident);
                        int healthIncidentCount =
                                (healthIncidentCountObject == null) ? 0 : healthIncidentCountObject.intValue();
                        if (healthIncidentCount > 0) {
                            strData += "<HealthIncident><name>" + healthIncident + "</name>"
                                    + "<count>" + healthIncidentCount + "</count></HealthIncident>";
                        }
                    }
                }
                strResult = "<Response><Status>success</Status><StatusDescription></StatusDescription><Result>" + strData + "</Result></Response>";
            }
        }
        catch (Exception ex) {
            strResult = "<Response><Status>fail</Status><StatusDescription>"
                    + "Error in executing operation : " + ex.getMessage() + "</StatusDescription></Response>";
        }
        resp.getWriter().println(strResult);
    }
}

The servlet code is straightforward:

1. Currently it supports only one type of report, i.e. HEALTHINCIDENTCOUNT_CURRENT_MONTH.
2. Next, we extract the pincode and healthincident request parameter values.
3. Then we invoke the DBUtils.getHealthIncidentCountForCurrentMonth method, passing the pincode and healthincident obtained in the previous step.
4. The method returns a Map in which each entry has a key (String) containing the health incident name and a value containing the count of incidents reported for that month, something like [ {"Flu", 20}, {"Cough", 30}, {"Cold", 10} ].
5. Finally, we format that into an XML response to return to the client, and this is the exact output that we see in the browser.
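On the client side, such a response can be consumed with the standard javax.xml.parsers API. The sketch below is my own illustration (the ReportResponseParser class is not part of the article's code); it extracts the per-incident counts from a response in the format the servlet emits:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ReportResponseParser {

    // Parses the <Response> XML produced by ReportsServlet and returns
    // a map of health incident name -> reported count.
    public static Map<String, Integer> parse(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Map<String, Integer> counts = new LinkedHashMap<>();
        NodeList incidents = doc.getElementsByTagName("HealthIncident");
        for (int i = 0; i < incidents.getLength(); i++) {
            Element e = (Element) incidents.item(i);
            String name = e.getElementsByTagName("name").item(0).getTextContent();
            int count = Integer.parseInt(
                    e.getElementsByTagName("count").item(0).getTextContent());
            counts.put(name, count);
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        String sample = "<Response><Status>success</Status><StatusDescription></StatusDescription>"
                + "<Result><HealthIncident><name>Flu</name><count>3</count></HealthIncident></Result></Response>";
        System.out.println(parse(sample)); // {Flu=3}
    }
}
```

A richer client would also check the <Status> element and surface the <StatusDescription> text on failure.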

Analyzing the DBUtils.getHealthIncidentCountForCurrentMonth method

I reproduce here the method from the DBUtils.java class that was listed before:


/**
 * This method gets the count of all health incidents in an area (Pincode/Zipcode)
 * for the current month.
 * @param healthIncident
 * @param pinCode
 * @return A Map containing each health incident name and the number of cases
 *         reported for it in the current month
 */
public static Map<String, Integer> getHealthIncidentCountForCurrentMonth(String healthIncident, String pinCode) {
    Map<String, Integer> _healthReport = new HashMap<String, Integer>();

    PersistenceManager pm = null;

    //Get the current month and year
    Calendar c = Calendar.getInstance();
    int currentMonth = c.get(Calendar.MONTH);
    int currentYear = c.get(Calendar.YEAR);

    try {
        //Determine if we need to generate data for only one health incident or ALL
        String[] healthIncidents;
        if (healthIncident.equalsIgnoreCase("ALL")) {
            String strHealthIncidents = getHealthIncidentMasterList();
            healthIncidents = strHealthIncidents.split(",");
        } else {
            healthIncidents = new String[]{healthIncident};
        }

        pm = PMF.get().getPersistenceManager();
        Query query;

        //If Pincode (Zipcode) is ALL, we retrieve records irrespective of Pincode
        if (pinCode.equalsIgnoreCase("ALL")) {
            //Form the query
            query = pm.newQuery(HealthReport.class,
                    "healthIncident == paramHealthIncident && reportDateTime >= paramStartDate && reportDateTime < paramEndDate && status == paramStatus");
            //Declare the parameters used above
            query.declareParameters("String paramHealthIncident, java.util.Date paramStartDate, java.util.Date paramEndDate, String paramStatus");
        } else {
            query = pm.newQuery(HealthReport.class,
                    "healthIncident == paramHealthIncident && pinCode == paramPinCode && reportDateTime >= paramStartDate && reportDateTime < paramEndDate && status == paramStatus");
            //Declare the parameters used above
            query.declareParameters("String paramHealthIncident, String paramPinCode, java.util.Date paramStartDate, java.util.Date paramEndDate, String paramStatus");
        }

        //For each health incident (i.e. Cold, Flu, Cough), retrieve the records
        for (int i = 0; i < healthIncidents.length; i++) {
            int healthIncidentCount = 0;

            //Set the From and To dates, i.e. the 1st of this month and the 1st of next month
            Calendar _cal1 = Calendar.getInstance();
            _cal1.set(currentYear, currentMonth, 1);
            Calendar _cal2 = Calendar.getInstance();
            _cal2.set(currentYear, currentMonth + 1, 1);

            List<HealthReport> codes;
            if (pinCode.equalsIgnoreCase("ALL")) {
                //Execute the query, passing in actual data for the filters
                codes = (List<HealthReport>) query.executeWithArray(healthIncidents[i], _cal1.getTime(), _cal2.getTime(), "ACTIVE");
            } else {
                codes = (List<HealthReport>) query.executeWithArray(healthIncidents[i], pinCode, _cal1.getTime(), _cal2.getTime(), "ACTIVE");
            }

            //Iterate through the results and increment the count
            for (Iterator<HealthReport> iterator = codes.iterator(); iterator.hasNext();) {
                iterator.next();
                healthIncidentCount++;
            }

            //Put the count in the Map data structure
            _healthReport.put(healthIncidents[i], Integer.valueOf(healthIncidentCount));
        }
        return _healthReport;
    } catch (Exception ex) {
        _logger.log(Level.SEVERE, "Could not retrieve the report. Reason : " + ex.getMessage());
        return null;
    } finally {
        if (pm != null) {
            pm.close();
        }
    }
}

I have attempted to provide comments so that you can follow the code, but I will list the important parts here:

1. We deal with the following classes from the javax.jdo package: PersistenceManager and Query.
2. We get the PersistenceManager instance via the PMF.java class that we wrote earlier.
3. We use the Query class to first build the query. For example:

query = pm.newQuery(HealthReport.class, "healthIncident == paramHealthIncident && reportDateTime >= paramStartDate && reportDateTime < paramEndDate && status == paramStatus");

What this means is that we are creating a query instance that retrieves records of the HealthReport class, and we are additionally passing a criteria string. Notice that the left-hand side names are fields of the HealthReport class (healthIncident, reportDateTime, status), while the right-hand side names are parameters which we will declare, and then pass values for, in order to execute the query.

4. We declare the parameters next, as shown below:

// declare parameters used above
query.declareParameters("String paramHealthIncident, java.util.Date paramStartDate, java.util.Date paramEndDate, String paramStatus");

5. Then we use the query.executeWithArray(...) method, which takes the values for all of the parameters declared above, in order.
6. The executeWithArray(...) call returns a List<> of HealthReport instances that you can iterate through to populate your result. In our code, we simply compute the total count for each of the health incidents (Flu, Cough, Cold).
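One subtlety in the date window above is the currentMonth + 1 in the second Calendar.set(...) call. java.util.Calendar is lenient by default, so month 12 (December plus one) normalizes to January of the following year, and the query window remains correct at year boundaries. The sketch below isolates that logic; the MonthRange class is my own illustration, not part of the article's code. Note that it also calls clear() first, which the original listing does not, so there the window boundaries additionally carry the current wall-clock time of day:

```java
import java.util.Calendar;

public class MonthRange {

    // Returns {start, end}: the first day of the given month and the first day
    // of the following month, mirroring the date window used in
    // getHealthIncidentCountForCurrentMonth.
    public static Calendar[] monthWindow(int year, int month) {
        Calendar start = Calendar.getInstance();
        start.clear(); // zero out time-of-day fields
        start.set(year, month, 1);
        Calendar end = Calendar.getInstance();
        end.clear();
        // Calendar normalizes month + 1, so December (11) rolls to January next year.
        end.set(year, month + 1, 1);
        return new Calendar[] { start, end };
    }

    public static void main(String[] args) {
        Calendar[] w = monthWindow(2010, Calendar.DECEMBER);
        System.out.println(w[1].get(Calendar.YEAR));                     // 2011
        System.out.println(w[1].get(Calendar.MONTH) == Calendar.JANUARY); // true
    }
}
```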

Servlet Configuration

To complete our servlet development, we also need to add the <servlet/> and <servlet-mapping/> entries to the web.xml file. This file is present in the WEB-INF folder of the project. The necessary fragment to be added to your web.xml file is shown below. Please note that you can use your own namespace and servlet class; just modify the fragment accordingly if you do so.

<servlet>
  <servlet-name>PostHealthIncidentServlet</servlet-name>
  <servlet-class>com.gaejexperiments.db.PostHealthIncidentServlet</servlet-class>
</servlet>
<servlet>
  <servlet-name>ReportsServlet</servlet-name>
  <servlet-class>com.gaejexperiments.db.ReportsServlet</servlet-class>
</servlet>
<servlet-mapping>
  <servlet-name>PostHealthIncidentServlet</servlet-name>
  <url-pattern>/posthealthincident</url-pattern>
</servlet-mapping>
<servlet-mapping>
  <servlet-name>ReportsServlet</servlet-name>
  <url-pattern>/reports</url-pattern>
</servlet-mapping>

Running the application locally

I am assuming that you have already created a new Google Web Application Project and have created the above Servlets, the web.xml entries, etc. Assuming all is well, we run the application by right-clicking on the project and selecting Run As -> Web Application. Launch the browser on your local machine and try out the URLs that we have covered so far, such as:

1. Adding a Health Report record :

http://localhost:8888/posthealthincident?healthincident=Flu&pincode=400101

2. Reports

http://localhost:8888/reports?type=HEALTHINCIDENTCOUNT_CURRENT_MONTH&healthincident=Flu&pincode=ALL

http://localhost:8888/reports?type=HEALTHINCIDENTCOUNT_CURRENT_MONTH&healthincident=ALL&pincode=ALL

Please replace the port 8888 with your local port number.

datastore-indexes.xml

If you run your application locally, you will notice that everything works fine. However, if you deploy this application to Google App Engine, you will not get any results when you query the reports. Why is that? It is because there are no indexes defined for your application.

If you visit your Developer Console at http://appengine.google.com and navigate to the Datastore Indexes link for your particular application, you will find that no indexes are defined, as shown below:

What are these indexes? An index is generated for every possible query that you fire in your program; this is Google's way of retrieving your data results efficiently. However, when you run in local mode, these indexes are generated for you automatically. If you look in the war/WEB-INF directory, you will find a directory named appengine-generated. Inside this directory is a file that is constantly updated, called datastore-indexes-auto.xml. The contents of this file for the reports that we have run so far are shown below:


<datastore-indexes>

  <datastore-index kind="HealthReport" ancestor="false" source="auto">
    <property name="healthIncident" direction="asc"/>
    <property name="status" direction="asc"/>
    <property name="reportDateTime" direction="asc"/>
  </datastore-index>

  <datastore-index kind="HealthReport" ancestor="false" source="auto">
    <property name="healthIncident" direction="asc"/>
    <property name="pinCode" direction="asc"/>
    <property name="status" direction="asc"/>
    <property name="reportDateTime" direction="asc"/>
  </datastore-index>

</datastore-indexes>

As you can see, two indexes were generated for the queries that we fired, and each of them contains the fields on which we queried the Health Reports.

To get your deployed application to function correctly, you will need to copy these indexes to a file named datastore-indexes.xml.

You will need to place this file in the war/WEB-INF folder of your application project.
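Based on the auto-generated file shown earlier, the hand-maintained index file would look like the fragment below. Note that the root element takes an autoGenerate attribute: as I read the documentation, "true" means the tooling also honors the auto-generated index file, while "false" restricts the application to the indexes listed here; the source="auto" attributes from the generated file can be dropped. Consult the official App Engine documentation to confirm the behavior for your SDK version.

```xml
<datastore-indexes autoGenerate="true">

  <datastore-index kind="HealthReport" ancestor="false">
    <property name="healthIncident" direction="asc"/>
    <property name="status" direction="asc"/>
    <property name="reportDateTime" direction="asc"/>
  </datastore-index>

  <datastore-index kind="HealthReport" ancestor="false">
    <property name="healthIncident" direction="asc"/>
    <property name="pinCode" direction="asc"/>
    <property name="status" direction="asc"/>
    <property name="reportDateTime" direction="asc"/>
  </datastore-index>

</datastore-indexes>
```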

Now deploy your project again. On successful deployment, revisit the Developer Console -> Datastore Indexes. You will find that the indexes have been created and are being built, as shown below:

Wait for a while. The building process goes through your current data; on completion, you will see the screen shown below (a refresh may be needed):

Try out the queries now on your deployed application and they should work fine. You can also manually create the datastore-indexes.xml file, but the local development server has this nice feature of auto-generating it for you, so that you don't end up making mistakes. Remember, though, to upload the updated datastore-indexes.xml as part of your deployment, otherwise the queries will silently fail.

Conclusion

This concludes the episode, in which we covered how to persist and query data using the JDO API. I hope it has given you enough input to start incorporating persistence into your applications. It is by no means a simple exercise, especially if you are coming from a SQL world; not everything SQL-like can be converted as-is to a NoSQL world, so refer to the several excellent sources available on the web in your persistence journey. I highly recommend the official documentation along with the GAE persistence blog mentioned at the beginning of the article.


GAEJ Experiments eBook

March 9, 2010 Leave a comment

It has been about a week now that I have released the GAEJ Experiments eBook. I am pleased to announce that it has been downloaded around 3500 times this past week.

To recap, the GAEJ Experiments eBook contains

  • All 15 episodes published at the site
  • A Bonus Chapter 16: Using the Datastore API

I thank all readers and my friends who made production of the eBook possible.

Download the eBook and distribute it. It is free of cost.

Download it here.

If you like the book and wish to donate, there is a Donate button on the right. All proceeds will be given 100% to http://www.kiva.org to a charity of my choice.

I thank you once again for your support and hope to continue with more experiments. Work on version 2.0 has already begun …

Cheers
Romin
