Internship at WSO2: 2012

Tuesday, December 25, 2012

Good Bye WSO2

It has been a really wonderful stay at WSO2. I can call it a fully featured internship. There was exposure, new connections, new learning, and fun. So thank you ALL (I really mean it) for ALL the help and support throughout the internship period and all the fun we had together.

There are some people that I personally thank.

I would like to summarize my internship.

Best Moment in my Internship period

Committing my changes to the branch (in the GReg project) – I worked on several projects (around 5), but I got the chance to commit to the branch in only 2 of them: test automation and the GReg basic filter improvement. It feels great to see our changes in the branch, knowing that they will be there in the future and that thousands of people are going to use them. And I hope all the other futuristic projects I did will be used in the future. (https://svn.wso2.com/wso2/repo/intern/malinga/)

Worst moment in my Internship period

Missing the BB finals – The BB final was one of the moments I wanted to be part of from the time we started the BB tournament (I had read a blog article by a past Wild Boars member about how they won the final match in the last minute). But I missed it due to the bad weather.

Best feeling in my Internship period

Non-tech guy knowing WSO2 – In the last few months almost everyone I met asked about my training place, and my answer was “A place called WSO2”. Most of them are non-tech people or work in different fields and didn't know about WSO2. But 1% of them replied “Ahh.. WSO2”. It feels great, and I was like “You know WSO2 :D”.

Worst feeling in my Internship period

Returning to the room after losing each and every TT match played – Luckily it happens extremely rarely, because I don’t return till I win at least a single match.

Contact Me

Email –             malingaonfire@gmail.com
FB -                  https://www.facebook.com/romesh.perera.08
Linked in -         http://lk.linkedin.com/in/malingaperera
Twitter -             https://twitter.com/malingaperera
Blog on my Internship (Internship at WSO2) -
                         http://iwso2.blogspot.com/
Personal blogs - http://www.executioncycle.lkblog.com/
                         http://www.thinklikemalinga.lkblog.com/
OR you can simply Google “Romesh malinga perera”

Wish you a merry Christmas and a happy new year!

Thursday, December 20, 2012

Greg Basic Filter Improvement (Enhanced basic filter) – Final

All done and committed. It was a good learning period for me. So let me give you a full summary of the project. I would like to give this as two separate documents, one for users and one for developers.

You can find the documentation from the links below.

Documentation – For Developers

Documentation – For Users

Greg Basic Filter Documentation For Developers

Coming Soon....

Greg New Basic Filter Documentation For Users

Introduction

This new basic filter allows you to shape the basic filter as you need. The earlier GReg basic filter only allowed you to search artifacts by name. With the new basic filter you can decide what you want to filter by. It might be the version, the namespace, or some minor attribute.

In the new GReg all artifacts are defined with an RXT file. You can easily access them from Extensions > Artifact Types, as shown in the screen capture below.

You can view and edit them from here. Below is part of a sample RXT file.


    /trunk/services/@{namespace}/@{name}
    overview_name
    overview_namespace
        Name
        Namespace
        Version
        Scopes
        Types
        Description
        Contact Type
        Contact Name/Organization Name
        Contact
            None
            Technical Owner
            Business Owner
    .
    .
    .

The following part, extracted from it, decides the columns in the listing page.

According to the above configuration, the Services listing page will have the columns Service Name, Service Version and Service Endpoints. Users select them according to their priorities. The new GReg basic filter lets you search by all the columns defined in the RXT, so in this case the attributes mentioned above will be available to select before filtering. In addition, it provides the ability to search by life-cycle as a new feature.

Life cycle Filter (Use of negative filtering 'NOT')

Life cycle Filter

Other 'filter by' criteria will be followed by a text-box

Filter By Life-Cycle

This is something new that comes with GReg 4.5.3 and this new basic filter. Even the advanced filter doesn't have the capability to search by life-cycle. This LC filter allows you to execute a vast range of queries, including negative searches.

Example queries that can be executed by the LC filter.

1. Services that are not in the ABC life-cycle.

2. Services that are in the ABC life-cycle, in the Development state.

3. Services that are in the ABC life-cycle, but not in the Development state.

Try it and you will easily understand how to use it.
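The three example queries can be read as one predicate with two "is / is not" switches, one on the life-cycle and one on the state. The Java sketch below illustrates only that logic. The names used (LifecycleFilterSketch, Artifact, matches, filter) are hypothetical and are not part of the GReg API, and treating artifacts with no life-cycle as "not in ABC" is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LifecycleFilterSketch {

    /** Hypothetical stand-in for a registry artifact; not a GReg class. */
    static final class Artifact {
        final String name;
        final String lifecycle; // null when no life-cycle is attached
        final String state;     // null when no life-cycle is attached

        Artifact(String name, String lifecycle, String state) {
            this.name = name;
            this.lifecycle = lifecycle;
            this.state = state;
        }
    }

    /**
     * lifecycleIs = false means "NOT in this life-cycle".
     * state = null means "any state"; stateIs = false means "NOT in this state".
     */
    static boolean matches(Artifact a, String lifecycle, boolean lifecycleIs,
                           String state, boolean stateIs) {
        boolean inLifecycle = lifecycle.equals(a.lifecycle);
        if (!lifecycleIs) {
            // Assumption: artifacts without any life-cycle also count as "not in ABC".
            return !inLifecycle;
        }
        if (!inLifecycle) {
            return false;
        }
        if (state == null) {
            return true;
        }
        return state.equals(a.state) == stateIs;
    }

    /** Returns the names of all artifacts that satisfy the filter. */
    static List<String> filter(List<Artifact> all, String lifecycle, boolean lifecycleIs,
                               String state, boolean stateIs) {
        List<String> names = new ArrayList<>();
        for (Artifact a : all) {
            if (matches(a, lifecycle, lifecycleIs, state, stateIs)) {
                names.add(a.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<Artifact> services = Arrays.asList(
                new Artifact("OrderService", "ABC", "Development"),
                new Artifact("BillingService", "ABC", "Testing"),
                new Artifact("LegacyService", "XYZ", "Development"));
        System.out.println(filter(services, "ABC", false, null, true));          // query 1
        System.out.println(filter(services, "ABC", true, "Development", true));  // query 2
        System.out.println(filter(services, "ABC", true, "Development", false)); // query 3
    }
}
```

Keeping the negation as two separate flags is what lets query 1 (not in the life-cycle at all) and query 3 (in the life-cycle, but not in one state) stay distinct.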

Limitations

1. All listing columns will be available to search by, and you don't have the option to omit some of them from the search only. Conversely, you can't have an attribute to search by that is not also a column in the listing page. Simply speaking, the items in the 'filter by' drop-down and the columns in the listing page will be the same.

2. You can't filter by an unbounded list.

Lunch from the Mentor @ The Mango Tree

It was a nice lunch from our mentor Amila Maharachchi. Lasindu and I, along with other Stratos members, were part of it. Below are some nice captures from Lasindu's cam.

Stratos team @ The Mango Tree

Malinga Perera(Me) on left with Damitha Kumarage

Sanjaya Ratneweera (on left) with one of our mentors, Amila Maharachchi

Shariq Muhammed (Our other mentor) with Lasindu

Thursday, December 13, 2012

Axis 2 Handler for Custom Greg Email Notifications

What I needed to do.

We needed to use GReg to help our support team, and we hosted all patch information in a GReg instance. GReg has subscription options: anyone who would like to get updates on a specific artifact can subscribe to it. You can learn more about this subscription process in this article.

We needed to customize the emails sent via GReg: we wanted to remove all unwanted details and give each mail a nice informative subject. I got to know it could be done with an Axis2 handler.

Process:

I followed this sample found in the WSO2 product wiki, and I needed to change the code there. You can go through that sample and try the basics first. When you do, remember to add the relevant dependencies, such as Axis2. They are not in the pom.xml, and the sample does not ask you to add them, so you will get a compiler error saying that the axis2 package cannot be found. You can overcome this by removing the exclusion below from the pom.xml.

               <exclusion>
                    <groupId>org.wso2.carbon</groupId>
                    <artifactId>org.wso2.carbon.context</artifactId>
               </exclusion>

Removing the above from the pom will get rid of the following error:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.3.2:compile (default-compile) on project org.wso2.carbon.registry.samples.handler: Compilation failure: Compilation failure:

[ERROR] /home/malinga/work/Servers/wso2greg-4.5.1/samples/handler/src/src/main/java/org/wso2/carbon/registry/samples/notifications/EmailTransformHandler.java:[6,23] package org.apache.axis2 does not exist

After you are done with this you can look at my code. I changed EmailTransformHandler.java to suit my needs. We were subscribed to the ChangeLCState event, and we got a mail for every state change.

Requirements:

We needed to change the subject to [QA] Patch <Artifact name>

We needed to change the message body

We needed to send the message for some selected state changes only

How I did it:

We needed to change the subject to [QA] Patch <Artifact name>

You can get the transport headers as shown below and change the subject to whatever you need it to be.

((Map<String, String>) msgContext.getOptions().getProperty(MessageContext.TRANSPORT_HEADERS)).put(MailConstants.MAIL_HEADER_SUBJECT, subject);

We needed to change the message body
It is the same as explained in the sample in the WSO2 product wiki. The main problem here is that you only get the text body, so you have to do a lot of string manipulation to get what you need out of it. And if the original message changes, everything will go wrong. Below are some samples from my code.

String sender = element.getText().substring(findStringInString(element.getText(), "This message"));

When you have extracted the required information, you can create a string with the body you need and set it as the message body.

if (element.getText().contains("'Development' to 'ReadyForQA'")) {
    element.setText(msgPromotedFromDev);
}

We needed to send the message for some selected state changes only

We can do a check on the message header and body, and if it is not a needed message you can return InvocationResponse.ABORT to stop the mail from being sent. You can find the example from my code below.

if (element.getText().contains("'Development' to 'ReadyForQA'")) {
    element.setText(msgPromotedFromDev);
} else if (element.getText().contains("'ReadyForQA' to 'Testing'")) {
    element.setText(msgPromotedFromQA);
} else if (element.getText().contains("'Testing' to 'Released'")) {
    element.setText(msgPromotedFromTesting);
} else {
    // Any other state change: stop the mail from being sent.
    return InvocationResponse.ABORT;
}
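The decision logic in the handler can be tried out without Axis2 on the classpath. The sketch below is a plain-Java illustration of the same idea: it maps a state transition found in the original body to a replacement body and returns null where the real handler would return InvocationResponse.ABORT. The class name, method names and replacement texts are hypothetical, since the msgPromotedFrom* strings are not shown in the post.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TransitionMailSketch {

    // Hypothetical replacement bodies keyed by the transition text to look for.
    private static final Map<String, String> BODY_BY_TRANSITION = new LinkedHashMap<>();

    static {
        BODY_BY_TRANSITION.put("'Development' to 'ReadyForQA'", "The patch is ready for QA.");
        BODY_BY_TRANSITION.put("'ReadyForQA' to 'Testing'", "The patch has moved to testing.");
        BODY_BY_TRANSITION.put("'Testing' to 'Released'", "The patch has been released.");
    }

    /** Returns the new mail body, or null when no mail should be sent. */
    static String rewriteBody(String originalBody) {
        for (Map.Entry<String, String> entry : BODY_BY_TRANSITION.entrySet()) {
            if (originalBody.contains(entry.getKey())) {
                return entry.getValue();
            }
        }
        return null; // the real handler would return InvocationResponse.ABORT here
    }

    /** Builds the subject in the "[QA] Patch <Artifact name>" form. */
    static String subjectFor(String artifactName) {
        return "[QA] Patch " + artifactName;
    }

    public static void main(String[] args) {
        String body = "The state was changed from 'Development' to 'ReadyForQA'.";
        System.out.println(subjectFor("patch0001"));
        System.out.println(rewriteBody(body));
    }
}
```

Keeping the transitions in one table means a new state change needs only one more entry, although the approach still breaks if the wording of the original message changes.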

Hope you got some understanding of notification e-mail customization. You can find more simple samples at http://docs.wso2.org/wiki/display/Governance451/Notification+E-mail+Customization+Sample

Sunday, December 9, 2012

The Most important grep (Some most useful commands/ways with grep)

I started my internship at WSO2 6-7 months ago. Back then I had no clue about the Linux command prompt. Even though I had worked with Ubuntu when I was doing my A/Ls, I didn’t use bash for anything at all. But with the start of my internship I started using the command line for almost everything. It is a one-stop shop for everything, and you feel really powerful and more in control of things. So I moved towards using the command line more and more, and ‘grep’ played a big part in that. Simply put, you can’t survive in a command line environment without ‘grep’. Below are some nice little commands using grep that I think everyone should know.

The basic usage of the grep command is to search for a specific string within a specified set of files. In the commands below, replace the <> and what is inside it with what you need.

grep "<string you need to search>" filename

e.g. grep "submitFilterForm" index.jsp

Here we are searching for "submitFilterForm" in index.jsp. This will print every line in which the string is found (if there is any). You can give a file pattern instead of the file name.

e.g. grep "submitFilterForm" *.jsp

If you need to, you can replace the search string with a regex. You can also use some options to make the search more advanced.

-i - Ignore case (“the”, “THE” and “The” are all treated the same)
-w - Whole words only (without this, substring matches also appear in the results)
-v - Negative search (displays the lines which do not match the given string/pattern)
-n - Line numbers (shows the line number of each matched line, counted from 1 within each file)

e.g. grep -iw "submitfilter" *.java

The above will search for "submitfilter" as a whole word, ignoring case, within all Java files. Try it and you will understand more.

e.g. grep -i "submitfilter" *.java

The above will search for "submitfilter", ignoring case, within all Java files.

e.g. grep -v "submitfilter" *.java

This will print the lines that do NOT match "submitfilter" within all Java files.
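For readers who think more easily in code, the meaning of the four options can be approximated in a few lines of Java. This is only an illustration of the semantics, not how grep is implemented: the class and method names are hypothetical, the search string is treated as a literal rather than a regex, and the whole-word check uses the regex word boundary \b, which matches grep's -w only for search strings that start and end with a letter, digit or underscore.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class GrepFlagsSketch {

    /** Returns the lines grep would print for a literal search string and the given options. */
    static List<String> grep(List<String> lines, String literal,
                             boolean ignoreCase,   // -i
                             boolean wholeWord,    // -w
                             boolean invert,       // -v
                             boolean lineNumbers)  // -n
    {
        String regex = Pattern.quote(literal);
        if (wholeWord) {
            regex = "\\b" + regex + "\\b";
        }
        Pattern pattern = Pattern.compile(regex, ignoreCase ? Pattern.CASE_INSENSITIVE : 0);

        List<String> printed = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            boolean found = pattern.matcher(line).find();
            if (found != invert) {
                // Line numbers are 1-based, as with grep -n.
                printed.add(lineNumbers ? (i + 1) + ":" + line : line);
            }
        }
        return printed;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "function submitFilter() {",
                "submitFilterForm(x);",
                "// nothing here");
        System.out.println(grep(lines, "submitfilter", true, true, false, false));  // like -iw
        System.out.println(grep(lines, "submitfilter", true, false, false, false)); // like -i
        System.out.println(grep(lines, "submitfilter", true, false, true, true));   // like -ivn
    }
}
```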

Friday, November 30, 2012

Hive Summarization script for billing needs

This contains links for the posts related to the Hive Summarization script for billing needs
Hive & Me Part 1
Hive & Me Part 2

Database Space usage monitoring for storage server

For our new product “Storage Server” we need monitoring functionality. It was possible to change my main project to cater to that need. In my main project I collect usage data for all the tenants and publish the data of the tenants who have exceeded their usage to the BAM server. I just needed to publish all the data to BAM so it can work as a monitoring feature for Storage Server.
I removed some parts of my main project to suit the current need. I didn't need “tenantBillingService”, which held me back in the main project. And I no longer need to change “stratos common” (which was used to get the package details needed in the calculations), as we are publishing all the details that we collect. The component level architecture of the project will be

WSO2 Test Automation Hackathon - Summary

This contains links for the posts related to the WSO2 Test Automation Hackathon
Beginning the Test Automation
Clarity Framework
Things that you should remember when test automation
Ending The Automation Hackathon

Measuring/Billing database usage in StratosLive - Summary

This article collects all the posts under Measuring/Billing database usage in StratosLive.

My Job
WSO2 Data Services Server User Guide
Need to find I/O rates, bandwidth used by each Database user
Limiting The Resource Use
I continued
Suggestions and replies
Collecting and summarizing the captured data
Followed the BAM samples.
Do you need data to play with?
Prototype version 1
Prototype version 1 has to be verified.
1st Verification
OSGi Services
Publishing to BAM
Using OSGi console to debug things

[Break for Test Automation]

Back to the Frozen project
WSO2 Storage Server
The Inevitable Change
Strange things do happen
Using Hive Scripts to Analyze and Summarize BAM data
Difference between two time ignoring the date
Replacing for ntask(quartz-scheduler), using timer task
It is almost 'THE END'

Wednesday, November 28, 2012

BB @ WSO2

We had a great basketball tournament lately, where our house came first. The schedule below gives some idea about the games we played and the 4 houses we have. BTW I am a Wild Boar.

Match - Who won

match 1 - Cloud Bots

match 2 - Wild Boars

match 3 - Cloud Bots

match 4 - Wild Boars

match 5 - Titans

match 6 - Cloud Bots

After the above, match 5 and match 6 were repeated as the 3rd place match and the final. Cloud Bots were fully confident of their win in match 5.

An unusual thing happened in the final matches. The heroes changed: Legions came 3rd and we (Wild Boars) became the champions. I was not there for the finals (bad luck), but I was there for all the other games and even the practice matches. It was such a wonderful experience.

Below are some fine clicks by Harindu

CEO @ play

I am in blue

Wild Boars Captain (in blue)

GReg Basic Filter Improvement - Continues

If you haven't read the first parts of the project, the following links will help you.

GReg Basic Filter Improvement - Starting
GReg Basic Filter Improvement - Feedbacks

I completed the project for GReg generic artifacts (go to the bottom of the page if you don't know what generic artifacts are). Now I have to do the same thing for the inbuilt basic artifacts like Service, WSDL, Policy and Schema. In the future all the artifact types will be defined with RXTs, and then my current implementation will work for all of them. Till then I have to make the same changes for the other inbuilt types.
So I decided to add filtering by LC and name to all inbuilt artifacts. But since services are the most important type, we decided to give them the full basic filter, as in generic artifacts. Following are some screen shots from the current state of the project.

Life cycle Filter (Use of negative filtering 'NOT')

Life cycle Filter

Other 'filter by' criteria will be followed by a text-box

GReg generic artifacts - When you deploy a GReg instance you get some in built artifact types. But if you need to have your own Metadata type, you can define it yourself. WSO2 Governance Registry provides the flexibility to configure and extend its functionalities and capabilities. One of its configurable capabilities is its Metadata model, which can be extended in such a way that anyone can use it to store any custom type of data, such as Services, WSDLs, etc., which is already there. To do this, you only need to add an XML file (registry extension or .rxt file) which includes the new Metadata model artifact as a resource to the registry.
This section contains detailed information on how to create the registry extension file up to content element, how to create a content which describes the data model of the artifact, how to deploy your file in the WSO2 Governance Registry, and how to add a menu item by adding a menu element to the registry extension file. More information: http://docs.wso2.org/wiki/display/Governance411/Configurable+Governance+Artifacts

GReg Basic Filter Improvement - Feedbacks

As I said before, projects change with the feedback they get. So I would like to share some comments I got from the other members.

I got the below comment from Senaka (GReg - Team Lead)

Hi Malinga,
Looks good. Now, say I take a random RXT Foo, it has three columns (in addition to Lifecycle Status),
1. Name
2. Version
3. Domain
How will this work? Will it be an exact match for name? or will what you enter only need to match a portion of the name (i.e. starts with). Will the same work for Version/Domain?
Lifecycle piece looks good, but there can be assets without an LC. For those, I think the "Select LC" box needs to also have a "None" in addition to what you have. Also, if no asset has an LC we don't display the LC-status column at all. In such a situation, the Filter-By should also not have it. Is that accounted for already?

So I decided to use the advanced filter within GReg to filter by all fields other than life-cycle, so I don't have to think about how the matching works. And I haven't taken the 2nd and 3rd points into account yet.

Like the above, some more feedback came in and the project changed accordingly. I have listed some more comments below.

Vijitha: "Can we make the drop down (second form left) "Is" & "Is Not" ?"

Senaka: "So this would now read as "Lifecycle is PatchLifeCycle in Any". May be its better to change Any in the state dropdown to "Any State", which will make it clearer to the user. Or he/she will have to expand the dropdown to understand what we have got there. I'm talking about a first time user."

I am really thankful to each and everyone who gave me feedback and advice. Here I have listed some that I found in the mail thread. I got feedback from other people too, when I met them face to face.
So I will blog about what I build in future posts.

GReg Basic Filter Improvement - Starting

Introduction:
Currently in GReg we have a basic filter that can be used to filter services by name. It is used to filter services on the fly without going into the advanced filter. This project will improve that basic search so that it can search the main few columns selected by the RXT (the columns in the list artifact page, which are defined in the RXT). Here our main concern will be searching by life-cycle.

The sketch below will give you an idea of how the UI will change after this project.

Search by life-cycle:
When you select life-cycle from the first drop-down (to filter by life-cycle), it will change the text-box next to it to a drop-down menu listing the available LCs. When you select an LC, it will create another drop-down containing the possible states within the selected LC. Both of these drop-downs will have an <any> item that filters without considering that drop-down.

Search by other fields:
When you select any other field to search by, it will show a text-box or a drop-down according to the selected column.

This is only the start:

This is only the start; the design and functionality will change with the feedback of others. I might end up with something that is not even close to what you see here. Keep in touch to know what happens.

Comment with any ideas you have, your ideas might get reflected in the next GReg release.

Thursday, November 15, 2012

Hive & Me Part 2

Continued from Hive & Me Part 1.....

After creating the required MySQL and Hive tables I moved on to the logic part of the script. I have to get the sums of all the bandwidth-In and bandwidth-Out entries separately. Then sum(bandwidth-In) - sum(bandwidth-Out) gives the current value and sum(bandwidth-Out) gives the history value. But doing this over all entries every hour is extremely costly. It is better to sum only the entries from the last hour and calculate the new current and history values from the previous ones. I got to know that we keep the time of the last run of the script in a MySQL table and write it to the Hive configuration using a Java class, so I used that value to get the sum of the entries in the last hour. It is not possible, however, to add this last-hour summarization to the previous current and history values in the same query. So I insert the last-hour summarization under a new ID and then sum the final and last-hour rows in the table.

INSERT INTO TABLE REGISTRY_USAGE_HOURLY_ANALYTICS 
SELECT concat(TID, "LastHour"), TID, HISTORY_USAGE, CURRUNT-HISTORY_USAGE FROM 
(SELECT TENANT_ID AS TID,
        sum(PAYLOAD_VALUE) AS HISTORY_USAGE 
FROM USAGE_STATS_TABLE
WHERE USAGE_STATS_TABLE.PAYLOAD_TYPE = 'ContentBandwidth-Out' AND Timestmp > ${hiveconf:last_hourly_ts}
GROUP BY SERVER_NAME, PAYLOAD_TYPE, TENANT_ID) table1
JOIN
(SELECT TENANT_ID AS TID2,
        sum(PAYLOAD_VALUE) AS CURRUNT 
FROM USAGE_STATS_TABLE
WHERE USAGE_STATS_TABLE.PAYLOAD_TYPE = 'ContentBandwidth-In' AND Timestmp > ${hiveconf:last_hourly_ts}
GROUP BY SERVER_NAME, PAYLOAD_TYPE, TENANT_ID) table2
ON(table2.TID2 = table1.TID);

The above script gets the summary of the usage in the last hour and inserts it into the table. The query below adds the last-hour summary to the final row (as of the previous hour) and creates the final value for the current hour.

INSERT INTO TABLE REGISTRY_USAGE_HOURLY_ANALYTICS 
SELECT concat(TENANT_ID, "Final"),
        TENANT_ID, 
        sum(HISTORY_USAGE) as HISTORY_USAGE,
        sum(CURRENT_USAGE) as CURRENT_USAGE
FROM REGISTRY_USAGE_HOURLY_ANALYTICS
GROUP BY TENANT_ID;

This query results in a MySQL table where each tenant has two rows, 'Final' and 'LastHour'. The Final row gives the current value (size of all the data that the user currently has in his directory) and the history value (size of all the data that the user has deleted up to now). This information should be available for each tenant, correct to the last hour.
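The hourly update works because the sums are additive: adding the last-hour figures to the previous Final row gives the same result as re-summing every entry. The Java sketch below checks that arithmetic on made-up numbers. The class and method names are hypothetical, and the code only mirrors the two queries conceptually; it does not talk to Hive or MySQL.

```java
final class UsageRollupSketch {

    /** One summary row: history = bytes deleted so far, current = bytes still stored. */
    static final class Usage {
        final long history;
        final long current;

        Usage(long history, long current) {
            this.history = history;
            this.current = current;
        }
    }

    private static long sum(long[] values) {
        long total = 0;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    /** Like the first query: history = sum(out), current = sum(in) - sum(out). */
    static Usage summarize(long[] bandwidthIn, long[] bandwidthOut) {
        long in = sum(bandwidthIn);
        long out = sum(bandwidthOut);
        return new Usage(out, in - out);
    }

    /** Like the second query: add the last-hour row to the previous final row. */
    static Usage rollUp(Usage previousFinal, Usage lastHour) {
        return new Usage(previousFinal.history + lastHour.history,
                         previousFinal.current + lastHour.current);
    }

    public static void main(String[] args) {
        Usage previousFinal = summarize(new long[] {500, 200}, new long[] {100});
        Usage lastHour = summarize(new long[] {50}, new long[] {30, 20});
        Usage newFinal = rollUp(previousFinal, lastHour);
        System.out.println("history=" + newFinal.history + ", current=" + newFinal.current);
    }
}
```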

Hive & Me Part 1

I started a new project to summarize registry bandwidth data (which refers to the space used in the registry). As you might know, we can use BAM to summarize data in Cassandra keyspaces using Hive scripts. It was not easy to work with the lack of Hive examples.

What I have to do

There was a table in Cassandra that contains registry usage data. When a user adds something to or removes something from his registry, an entry is recorded as “registryBandwidth-In” (when he adds something) or “registryBandwidth-Out” (when he deletes something). I have to summarize those records in such a way that we have access to the current value (size of all the data that the user currently has in his directory) and the history value (size of all the data that the user has deleted up to now). This information should be available for each tenant, correct to the last hour.

Implementation Plan

If I can write the current and history values into a MySQL table, where each tenant has a separate row, it is good enough. First I thought of having a table in Hive with the current and history values and a MySQL table mapped to it.

The code below uses the JDBC Storage Handler for Hive; more information on how to use it can be found in Kasun's blog: http://kasunweranga.blogspot.com/2012/06/jdbc-storage-handler-for-hive.html

CREATE EXTERNAL TABLE IF NOT EXISTS REGISTRY_USAGE_HOURLY_ANALYTICS ( 
        ID STRING,
        TENANT_ID STRING,       
        HISTORY_USAGE BIGINT,
        CURRENT_USAGE BIGINT) 
        STORED BY 'org.wso2.carbon.hadoop.hive.jdbc.storage.JDBCStorageHandler' TBLPROPERTIES (
        "mapred.jdbc.driver.class" = "com.mysql.jdbc.Driver", 
        "mapred.jdbc.url" = "jdbc:mysql://localhost:3306/WSO2USAGE_DB",
        "mapred.jdbc.username" = "root",
        "mapred.jdbc.password" = "root",
        "hive.jdbc.update.on.duplicate" = "true",
        "hive.jdbc.primary.key.fields" = "ID",
        "hive.jdbc.table.create.query" = "CREATE TABLE REGISTRY_USAGE_HOURLY_ANALYTICS (
        ID VARCHAR(50),
        TENANT_ID VARCHAR(50),
        HISTORY_USAGE BIGINT,
        CURRENT_USAGE  BIGINT)"
);

This will create 2 tables: one is a Hive table and the other is a MySQL table. Both will have the name "REGISTRY_USAGE_HOURLY_ANALYTICS". Whatever we write to the Hive table will be written to the MySQL table. In the next code block I create a mapping to the MySQL table. Using this temporary Hive table I can query the MySQL table.


CREATE EXTERNAL TABLE IF NOT EXISTS REGISTRY_USAGE_HOURLY_ANALYTICS_TEMP ( 
        ID STRING,
        TENANT_ID STRING,       
        HISTORY_USAGE BIGINT,
        CURRENT_USAGE BIGINT) 
        STORED BY 'org.wso2.carbon.hadoop.hive.jdbc.storage.JDBCStorageHandler' TBLPROPERTIES (
        "mapred.jdbc.driver.class" = "com.mysql.jdbc.Driver", 
        "mapred.jdbc.url" = "jdbc:mysql://localhost:3306/WSO2USAGE_DB",
        "mapred.jdbc.username" = "root",
        "mapred.jdbc.password" = "root",
        "hive.jdbc.primary.key.fields" = "TENANT_ID",
        "mapred.jdbc.input.table.name" = "REGISTRY_USAGE_HOURLY_ANALYTICS"
);

Continued in Part 2....

Friday, November 2, 2012

End of the quiet October to hopeful November

It was a really quiet October; when you look at the blog, nothing was written. Actually it was a busy October, which is why I didn't have time to write articles for the blog. I worked with BAM and Hive summarization scripts for BAM, so I am thinking about writing on "Hive and summarization scripts for BAM". The next project I worked on was to improve the GReg basic filter and add filtering by LC (life-cycle) to it.

What you should expect in the upcoming month
About Hive
About BAM summarization scripts
About Bandwidth usage data summarization
About Greg Basic Filter improvement
About Greg LC filtering feature.

Looking forward to a lot of articles this November! Hopefully :)

Wednesday, September 26, 2012

It is almost 'THE END'

Now this is the summary of what I have done.

It is agreed to measure the database space usage of each tenant. Here we will not limit the tenant (in terms of database access) based on its DB usage, but will keep track of the excess DB space used by each tenant.

Component level view of the process.

Changes to each component:

Rss-manager: This component is used to gather usage data from the RSS. It adds the data to a queue, from which the usage agent component retrieves it. The usage data collection is handled by a couple of newly added classes and is scheduled to run daily. It is configurable to start at a given time and repeat with a given time gap (currently decided to run at 24h intervals). Here we are only interested in tenants with exceeded usage, so we need to know the usage plan of a tenant in order to get its limits. We thought of publishing information only about those tenants who exceed the space limits, for two reasons:

To reduce the data transfer between components and to the BAM server.
Exceeded DB size is all we need for billing calculations.

Usage-agent: This component retrieves usage data from the above-mentioned queue in the rss-manager. This is handled by a newly added class, DatabaseUsageDataRetrievalTask. It is also scheduled to run daily, and it is configurable to start at a given time and repeat with a given time gap (currently decided to run at 24h intervals).

Stratos-commons: This is where usage plan details are handled. Plan details are read from 'multitenancy-packages.xml' and made available for use through a service. Here I have changed the XML file, the XML reading class and the data storing bean to contain DB usage related data.

Dependencies: This depends on a component that is yet to be developed (to get the tenant usage plan given the tenant domain/ID), and that component is required for the changed RSS-Manager component to work perfectly.
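The hand-off between the two components can be pictured as a simple queue: the collecting side enqueues only the tenants that are over their plan limit, and the retrieving side drains whatever has been queued. The Java sketch below is a rough illustration under that reading. Every name in it (DbUsageQueueSketch, UsageEntry, collect, drain) is hypothetical and none of it is taken from the rss-manager or usage-agent code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

final class DbUsageQueueSketch {

    /** Hypothetical record of how far a tenant is over its plan limit. */
    static final class UsageEntry {
        final String tenantDomain;
        final long exceededBytes;

        UsageEntry(String tenantDomain, long exceededBytes) {
            this.tenantDomain = tenantDomain;
            this.exceededBytes = exceededBytes;
        }
    }

    // Thread-safe queue, since collection and retrieval run as separate scheduled tasks.
    private final Queue<UsageEntry> queue = new ConcurrentLinkedQueue<>();

    /** Collecting side: enqueue only tenants whose usage exceeds the plan limit. */
    void collect(String tenantDomain, long usedBytes, long planLimitBytes) {
        if (usedBytes > planLimitBytes) {
            queue.add(new UsageEntry(tenantDomain, usedBytes - planLimitBytes));
        }
    }

    /** Retrieving side: take everything queued so far, leaving the queue empty. */
    List<UsageEntry> drain() {
        List<UsageEntry> drained = new ArrayList<>();
        UsageEntry entry;
        while ((entry = queue.poll()) != null) {
            drained.add(entry);
        }
        return drained;
    }
}
```

Filtering before enqueueing is what keeps the data transferred between the components small, which is the first of the two reasons given above.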

Replacing for ntask(quartz-scheduler), using timerTask or scheduler

I tried using ntask (a scheduler that uses quartz-scheduler inside), but for some reason it was not working. I found no one to ask about it; those who have used it don't have a clear idea of it, and therefore can't figure out what went wrong. Unluckily none of my mentors are familiar with it. So I decided to find an alternative, and thought of Java's built-in TimerTask.

The quartz-scheduler web site mentions several things under “Why not just use java.util.Timer?”

“There are many reasons! Here are a few:

Timers have no persistence mechanism.
Timers have inflexible scheduling (only able to set start-time & repeat interval, nothing based on dates, time of day, etc.)
Timers don't utilize a thread-pool (one thread per timer)
Timers have no real management schemes - you'd have to write your own mechanism for being able to remember, organize and retrieve your tasks by name, etc.

...of course to some simple applications these features may not be important, in which case it may then be the right decision not to use Quartz.”

Seems like I can live with TimerTask. I don't need any “persistence mechanism”, as I am starting everything all over again at server startup. With a server restart all the threads will be gone, and I just need one thread, so I don't need a thread-pool. It is simple, and I don't need to manage my tasks after I create them. As for flexibility, as far as I know we can give a start date (java.util.Date), and that is what I am using.

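Timer timer = new Timer(true); // run the timer thread as a daemon
Calendar firstRun = Calendar.getInstance();

// Keep today's date and set only the time part of the first run.
firstRun.set(Calendar.HOUR_OF_DAY, HOUR_OF_THE_DAY_TO_RUN_DB_USAGE_RETRIEVAL);
firstRun.set(Calendar.MINUTE, 0);
firstRun.set(Calendar.SECOND, 0);
// checker is the TimerTask to run; the last argument is the repeat period in milliseconds.
timer.schedule(checker, firstRun.getTime(), 20000);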

So I created the wanted date by getting a date and setting the time part, and we are good to go.
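Two details of the snippet above are worth spelling out. The period of 20000 is 20 seconds, so a 24h interval would need a larger value. Also, java.util.Timer runs a task immediately when the first-run time is already in the past. The sketch below is a self-contained variant under those observations: DailyTimerSketch, ONE_DAY_MILLIS and scheduleDaily are hypothetical names, and the job passed in is a placeholder for the real usage retrieval.

```java
import java.util.Date;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.TimeUnit;

final class DailyTimerSketch {

    /** 24 hours in milliseconds. */
    static final long ONE_DAY_MILLIS = TimeUnit.HOURS.toMillis(24);

    /**
     * Schedules job on a daemon timer thread, first at firstRun and then every 24 hours.
     * If firstRun is already in the past, java.util.Timer runs the task immediately.
     */
    static Timer scheduleDaily(final Runnable job, Date firstRun) {
        Timer timer = new Timer(true);
        TimerTask task = new TimerTask() {
            @Override
            public void run() {
                job.run();
            }
        };
        timer.schedule(task, firstRun, ONE_DAY_MILLIS);
        return timer;
    }

    public static void main(String[] args) throws InterruptedException {
        Timer timer = scheduleDaily(new Runnable() {
            @Override
            public void run() {
                System.out.println("collecting database usage (placeholder job)");
            }
        }, new Date());
        Thread.sleep(200); // give the daemon thread time to run before the JVM exits
        timer.cancel();
    }
}
```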

Difference between two time ignoring the date - Using Java

When you want the amount of time until 12.00 midnight, it doesn't really matter what the date is. If you want to measure the time between two instants it is simple: take the time in milliseconds on both occasions and subtract one from the other. But here we want the difference between two times, where one is a future time on the same date. In my application I wanted to schedule a script at 12.00 midnight by giving a delay. I could simply do it using a cron job, but in my case it didn't work. So I wanted to know the delay between the two times.

Calendar future = Calendar.getInstance();
future.set(Calendar.HOUR_OF_DAY, HOUR_TO_RUN_DB_USAGE_RETRIEVAL);
future.set(Calendar.MINUTE, 0);
future.set(Calendar.SECOND, 0);
Calendar now = Calendar.getInstance();
long requiredDelay = future.getTimeInMillis() - now.getTimeInMillis();

Here I take a calendar instance for today as the future time and change only its time part (HH:mm:ss), so the date remains the same. Then I get another calendar instance for now, and subtracting the getTimeInMillis() values of the two gives the desired result. Note that the result is negative if the chosen time has already passed today.
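On Java 8 or later (newer than what I used here), the same idea can be written with java.time.LocalTime, which carries no date at all. The sketch below is only an illustration under that assumption; the class and method names are made up, and it treats a target time that has already passed as belonging to the next day by adding 24 hours, which ignores daylight saving changes.

```java
import java.time.Duration;
import java.time.LocalTime;

class TimeOfDayDelaySketch {
    /** Delay from 'now' to the next occurrence of 'target', looking only at the time of day. */
    static Duration delayUntil(LocalTime now, LocalTime target) {
        Duration delay = Duration.between(now, target);
        // A negative delay means the target has already passed today, so use tomorrow's.
        if (delay.isNegative()) {
            delay = delay.plusHours(24);
        }
        return delay;
    }

    public static void main(String[] args) {
        Duration delay = delayUntil(LocalTime.now(), LocalTime.MIDNIGHT);
        System.out.println("Milliseconds until midnight: " + delay.toMillis());
    }
}
```

The value from toMillis() can then be passed as the delay when scheduling the task.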

Thursday, September 20, 2012

Using Hive Scripts to Analyze and Summarize BAM data

Now that I have completed everything up to publishing the usage data, I need to analyze and summarize that data. This can be done simply by writing a Hive script and scheduling it within BAM. In the main menu of BAM you will find a Manage menu, which contains a menu item named Analytics. Under Analytics you get two sub menus, one to list the existing scripts and one to add new scripts.

Now go to the 'Add' sub menu there (Main>Manage>Analytics>Add). Here you can write your script and schedule it.

Below is a simple script written by Shariq Muhammed, SE @ WSO2. I used this script to summarize the data in one of the tables created while pumping data into BAM. I have removed some parts of it, as they won't be relevant to you.

CREATE EXTERNAL TABLE IF NOT EXISTS UsageStatsTable (id STRING,
        payload_ServerName STRING,
        payload_TenantID STRING,
        payload_Data STRING,
        payload_Value BIGINT,
        timestamp BIGINT) 
        STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler' WITH SERDEPROPERTIES ( 
        //sort Properties
);

CREATE EXTERNAL TABLE IF NOT EXISTS UsageStatsHourFact (id String, 
        hour_fact STRING,
        payload_ServerName STRING, 
        payload_TenantID STRING,        
        payload_Data STRING,
        payload_Value BIGINT) 
        STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler' WITH SERDEPROPERTIES ( 
        //sort properties
);

select #some columns from my table#

insert into table #some other table#

As shown above, you can select and group the pumped data and insert the summary data into a new table. If you don't know the Hive syntax, it is similar to SQL, and there is a great tutorial at the following link: https://cwiki.apache.org/Hive/tutorial.html.

Tuesday, September 18, 2012

Strange things do happen

This is about a basic error that I made. I got an error saying “class not found” at runtime. Let me explain the situation. I was calling a method in one OSGi bundle from another OSGi bundle. To be more specific, I was working in the usage agent component and I needed to call a static method that was in the rss-manager component.

In rss-manager,

public static Queue<DBStatisticsEntry> getStats() {
    return stats;
}

In usage agent,

dbStats = DBSizeChecker.getStats();

To a Java user this code has no errors. But remember that it did give the error “class not found”, so before even getting to the method, the runtime doesn't know the class. The IDE doesn't report any errors, which means it is not a syntax error. And it doesn't give any warnings, so there are no best-practice violations either.

My guesses at what had gone wrong:

The rss-manager jar inside the server does not have the DBSizeChecker class in it. As I had added this class only recently, I thought for a second: what if the jar inside the server is not the current one that I am using? But opening the jar showed that the class is there.
A problem with the IDE. There might be a syntax issue that the IDE doesn't show due to some problem within it. And my server doesn't throw any exceptions; only the IDE gives me that exception while debugging.
Is this the right way to use static methods? I was confident that it was, but as I didn't have a clue about what went wrong, I even did a search on “how to call a static method from another class in java”.
rss-manager might not have started. I checked it using the OSGi console, and it was active and running.

And what was the real problem?

The DBSizeChecker class was in the 'internal' folder of the OSGi bundle. If you don't know what that means: in OSGi we do not expose internal classes to the outside. That is why we got the error not from the IDE but at runtime, where OSGi comes into action. This is basic OSGi, but the basics matter.
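To make the idea concrete, here is a toy model in plain Java. It is not the OSGi API and not how an OSGi framework is implemented; it only mimics the rule that a bundle reveals classes from the packages it exports and reports everything else as not found. The FakeBundle class and the org.example package names are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ExportedPackagesSketch {
    /** Simplified stand-in for a bundle: only exported packages are visible to others. */
    static class FakeBundle {
        private final Set<String> exportedPackages;

        FakeBundle(String... exportedPackages) {
            this.exportedPackages = new HashSet<>(Arrays.asList(exportedPackages));
        }

        /** Mimics a runtime lookup from another bundle. */
        String resolve(String className) throws ClassNotFoundException {
            int lastDot = className.lastIndexOf('.');
            String pkg = lastDot < 0 ? "" : className.substring(0, lastDot);
            if (!exportedPackages.contains(pkg)) {
                // The class exists in the jar, but outsiders are not allowed to see it.
                throw new ClassNotFoundException(className);
            }
            return className;
        }
    }

    static boolean isVisible(FakeBundle bundle, String className) {
        try {
            bundle.resolve(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical bundle that exports its public package but not its 'internal' one.
        FakeBundle rssManager = new FakeBundle("org.example.rss.manager");
        System.out.println(isVisible(rssManager, "org.example.rss.manager.SomePublicClass"));
        System.out.println(isVisible(rssManager, "org.example.rss.manager.internal.DBSizeChecker"));
    }
}
```

In this model the first lookup succeeds and the second fails, even though both classes would sit in the same jar, which matches what I saw: the class was in the jar, the IDE was happy, and the runtime still could not find it.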

This problem showed me how important it is to have someone to help when you get errors. The error was due to something simple, but I still didn't get it; when I showed it to a senior (Kasun), he nailed the error in a minute.