This article collects all the posts in the "Measuring/Billing database usage in StratosLive" series.
My Job
WSO2 Data Services Server User Guide
Need to find I/O rates, bandwidth used by each Database user
Limiting The Resource Use
I continued
Suggestions and replies
Collecting and summarizing the captured data
Followed the BAM samples.
Do you need data to play with?
Prototype version 1
Prototype version 1 has to be verified.
1st Verification
OSGi Services
Publishing to BAM
Using OSGi console to debug things
[Break for Test Automation]
Back to the Frozen project
WSO2 Storage Server
The Inevitable Change
Strange things do happen
Using Hive Scripts to Analyze and Summarize BAM data
Difference between two time ignoring the date
Replacing for ntask(quartz-scheduler), using timer task
It is almost 'THE END'
Friday, November 30, 2012
Measuring/Billing database usage in StratosLive - Summary
Labels: bam, BAM2, billing, database, database size, mysql, remote debugging, StratosLive RSS, usage agent, WSO2 Data Services Server
Thursday, November 15, 2012
Hive & Me Part 1
I started a new project to summarize registry bandwidth data (that is, the space used in the registry). As you might know, BAM can summarize data stored in
Cassandra keyspaces using Hive scripts. This was not easy to work with, given the lack of
Hive examples.
What I have to do
There is a table in Cassandra that contains registry usage data. When a user adds something to or removes something from the registry, an entry is recorded as "registryBandwidth-In" (when something is added) or "registryBandwidth-Out" (when something is deleted). I have to summarize those records so that we have access to the current value (the size of all the data the user currently has in the registry) and the history value (the size of all the data the user has deleted up to now). This information should be available for each tenant, accurate to the last hour.
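The summarization described above can be sketched in plain Java, independent of Hive. This is a minimal illustration, not the script used in the project: the `UsageRecord` and `RegistryUsageSummarizer` names are hypothetical, and it assumes that the current value is the total added minus the total deleted, while the history value is the total deleted.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical record type: one row of the registry usage table.
class UsageRecord {
    final String tenantId;
    final String type;   // "registryBandwidth-In" or "registryBandwidth-Out"
    final long bytes;

    UsageRecord(String tenantId, String type, long bytes) {
        this.tenantId = tenantId;
        this.type = type;
        this.bytes = bytes;
    }
}

class RegistryUsageSummarizer {
    // Returns tenantId -> {history, current}.
    // Assumption: current = total added - total deleted; history = total deleted.
    static Map<String, long[]> summarize(List<UsageRecord> records) {
        Map<String, long[]> totals = new TreeMap<>();
        for (UsageRecord r : records) {
            long[] t = totals.computeIfAbsent(r.tenantId, k -> new long[2]);
            if ("registryBandwidth-In".equals(r.type)) {
                t[1] += r.bytes;   // added data increases current usage
            } else if ("registryBandwidth-Out".equals(r.type)) {
                t[0] += r.bytes;   // deleted data accumulates in history
                t[1] -= r.bytes;   // and no longer counts as current
            }
        }
        return totals;
    }
}
```

For example, a tenant with one 100-byte "In" record and one 30-byte "Out" record would end up with a history value of 30 and a current value of 70 under these assumptions.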
Implementation Plan
If I can write the current and history values into a MySQL table in which each tenant has its own row, that is good enough. My first thought was to have a Hive table holding the current and history values, with a MySQL table mapped to it.
The code below uses the JDBC Storage Handler for Hive; more information on how to use it can be found on Kasun's blog: http://kasunweranga.blogspot.com/2012/06/jdbc-storage-handler-for-hive.html
CREATE EXTERNAL TABLE IF NOT EXISTS REGISTRY_USAGE_HOURLY_ANALYTICS (
ID STRING,
TENANT_ID STRING,
HISTORY_USAGE BIGINT,
CURRENT_USAGE BIGINT)
STORED BY 'org.wso2.carbon.hadoop.hive.jdbc.storage.JDBCStorageHandler' TBLPROPERTIES (
"mapred.jdbc.driver.class" = "com.mysql.jdbc.Driver",
"mapred.jdbc.url" = "jdbc:mysql://localhost:3306/WSO2USAGE_DB",
"mapred.jdbc.username" = "root",
"mapred.jdbc.password" = "root",
"hive.jdbc.update.on.duplicate" = "true",
"hive.jdbc.primary.key.fields" = "ID",
"hive.jdbc.table.create.query" = "CREATE TABLE REGISTRY_USAGE_HOURLY_ANALYTICS (
ID VARCHAR(50),
TENANT_ID VARCHAR(50),
HISTORY_USAGE BIGINT,
CURRENT_USAGE BIGINT)"
);
This creates two tables: a Hive table and a MySQL table, both named "REGISTRY_USAGE_HOURLY_ANALYTICS". Whatever we write to the Hive table is also written to the MySQL table. In the next code block I create a mapping to the MySQL table. Using this temporary Hive table, I can query the MySQL table.
CREATE EXTERNAL TABLE IF NOT EXISTS REGISTRY_USAGE_HOURLY_ANALYTICS_TEMP (
ID STRING,
TENANT_ID STRING,
HISTORY_USAGE BIGINT,
CURRENT_USAGE BIGINT)
STORED BY 'org.wso2.carbon.hadoop.hive.jdbc.storage.JDBCStorageHandler' TBLPROPERTIES (
"mapred.jdbc.driver.class" = "com.mysql.jdbc.Driver",
"mapred.jdbc.url" = "jdbc:mysql://localhost:3306/WSO2USAGE_DB",
"mapred.jdbc.username" = "root",
"mapred.jdbc.password" = "root",
"hive.jdbc.primary.key.fields" = "TENANT_ID",
"mapred.jdbc.input.table.name" = "REGISTRY_USAGE_HOURLY_ANALYTICS"
);
Continued in part 2...
Wednesday, September 26, 2012
It is almost 'THE END'
This is a summary of what I have done.
It was agreed to measure the database space used by each tenant. We
will not limit a tenant's database access based on its DB
usage, but we will keep track of the excess DB space used by each tenant.
Component-level view of the process.
Changes to each component:
RSS-manager: This component
gathers usage data from the RSS and adds it
to a queue, from which the usage-agent component
retrieves it. The usage data collection is handled by a couple
of newly added classes and is scheduled to run daily. It
can be configured to start at a given time and repeat at a
given interval (currently 24 hours). We
are only interested in tenants that have exceeded their usage limits, so we need
to know the usage plan of each such tenant in order to get its
limits. We decided to publish information only about those tenants
who exceed the space limits, for two reasons:
- To reduce the data transfer between components and to the BAM server.
- Exceeded DB size is all we need for billing calculations.
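The filtering step described above can be sketched as follows. This is only an illustration under stated assumptions: `ExceededUsage` and `ExceededUsageCollector` are hypothetical names rather than the classes added to the rss-manager, sizes are assumed to be in bytes, and tenants whose plan limit is unknown are simply skipped.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical value object for one tenant's excess usage.
class ExceededUsage {
    final String tenantDomain;
    final long exceededBytes;

    ExceededUsage(String tenantDomain, long exceededBytes) {
        this.tenantDomain = tenantDomain;
        this.exceededBytes = exceededBytes;
    }
}

class ExceededUsageCollector {
    // Queue that a consumer (the usage agent in this post) would read from.
    final Queue<ExceededUsage> queue = new ConcurrentLinkedQueue<>();

    // usageBytes: tenant -> measured DB size; limitBytes: tenant -> plan limit.
    void collect(Map<String, Long> usageBytes, Map<String, Long> limitBytes) {
        for (Map.Entry<String, Long> e : usageBytes.entrySet()) {
            Long limit = limitBytes.get(e.getKey());
            if (limit == null) {
                continue; // plan unknown: skipped in this sketch
            }
            long excess = e.getValue() - limit;
            if (excess > 0) {
                // Only tenants over their limit are queued for publishing.
                queue.add(new ExceededUsage(e.getKey(), excess));
            }
        }
    }
}
```

Queuing only the excess keeps the payload small, which matches the two reasons listed above.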
Usage-agent: This component
retrieves usage data from the above-mentioned queue in the
rss-manager. This is handled by a newly added class,
DatabaseUsageDataRetrievalTask, which is also scheduled to run
daily. It can be configured to start at a given time and
repeat at a given interval (currently 24
hours).
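A scheduled retrieval of this kind can be sketched with `java.util.Timer`. This is not the actual DatabaseUsageDataRetrievalTask: `QueueDrainTask` and `UsageAgentScheduler` are hypothetical stand-ins, and the queue entries are plain strings for simplicity.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.Queue;
import java.util.Timer;
import java.util.TimerTask;

// Illustrative stand-in for a retrieval task; drains everything queued so far.
class QueueDrainTask extends TimerTask {
    private final Queue<String> source;
    final List<String> retrieved = new ArrayList<>();

    QueueDrainTask(Queue<String> source) {
        this.source = source;
    }

    @Override
    public void run() {
        String entry;
        // poll() returns null once the queue is empty.
        while ((entry = source.poll()) != null) {
            retrieved.add(entry);
        }
    }
}

class UsageAgentScheduler {
    static final long DAY_MS = 24L * 60 * 60 * 1000;

    // Starts the task at firstRun and repeats it every periodMs milliseconds.
    static Timer schedule(TimerTask task, Date firstRun, long periodMs) {
        Timer timer = new Timer("usage-agent", true); // daemon thread
        timer.scheduleAtFixedRate(task, firstRun, periodMs);
        return timer;
    }
}
```

Calling `UsageAgentScheduler.schedule(task, startTime, UsageAgentScheduler.DAY_MS)` would give the 24-hour interval mentioned above, with the start time and interval left as the configurable parts.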
Stratos-commons: This is where
usage plan details are handled. Plan details are read from
'multitenancy-packages.xml' and made available through a
service. I have changed the XML file, the XML-reading class, and the
data-storing bean so that they contain DB usage related data.
Dependencies: This work depends on
a component that is yet to be developed (one that returns the tenant usage plan given the
tenant domain/ID). That component is required for the changed RSS-manager
component to work properly.
Labels: architecture, billing, rss manager, StratosLive RSS, the end, usage agent, WSO2 Data Services Server, WSO2 Storage Server
Tuesday, September 18, 2012
The Inevitable Change
Change is there in everything you see,
which is why 'change' is known as the only thing that does not change. When
I moved back to my old (main) project, I felt that I had to start from
scratch again. The code I had written didn't work, and almost every
line had errors in it. Those errors were due to missing classes,
missing methods, changed signatures, and so on. It was hard work to bring it
back to its earlier state. By now it is collecting and publishing
usage data as intended. I worked on this for over a week but added
nothing extra; I only took it back to where it was.
Project Progress
What I have done
- Completed collecting database usage data.
- Completed publishing it to BAM.
What more to do
- I don't yet have a way to get the usage plan of each tenant, so for the sake of testing I have hard-coded it.
- Need to analyze and summarize the usage data that was sent to BAM; this is done using Hive scripts.
- Need to clean up and reformat the code according to best practices where I have missed them.

