Nifty tidbits

Nifty tidbits and random thoughts on technology and anything else that catches my fancy

Archive for the 'Java' Category


Desultory Monday…

Posted by Raghu on June 16, 2008

This entry was posted using It’s All Text! on Firefox 3.0 RC2 on Ubuntu Hardy Heron, with an Emacs 23 snapshot as the editor. I love it :-)

Well, It’s All Text! is great if you hate typing into web forms whose text boxes make editing such a big pain in the butt.

It’s great to see that It’s All Text! has been updated to work with FF 3.0 now. The fun part would be to see whether this works on Windows with Cygwin Emacs as the editor. I had problems the last time I tried that, but that was some time ago.

Today’s been a desultory Monday. Spent some time getting the Emacs snapshot running with pretty fonts on my Hardy install. It’s beautiful.

The rest of the day has mostly gone into scratching my head over Hadoop. What I’d like to do is parse access logs and generate multiple outputs, i.e. a single input of gobs of web access logs and several outputs: requests by country, popular pages, share of each client browser and so on.

  1. Parse the web log.
  2. Pull out the remote IPs and use a geo IP lookup to find the originating country.
  3. Pull out the user agent field and figure out the browser distribution.
  4. Filter the requested resources down to pages only and rank the pages by popularity.

Now there seem to be quite a few ways of doing this:

  • Code the whole thing in Java - and this is where I’m getting into analysis paralysis.
    Look at ways to generate multiple outputs from MapReduce and then use Job and JobControl to set up the pipeline.
  • Use Pig - the examples on the Pig overview page seem to suggest that this should be trivial with Pig.
  • Use Cascading - seems to be doing the same thing - will need to do this in JRuby or Groovy though.
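
For the pure Java route, one commonly used trick for getting several reports out of a single pass is to have the map step emit keys tagged with the report they belong to, and let the reduce step simply sum per key. Here’s a minimal sketch of that idea in plain Java, with no Hadoop API involved. Everything in it is an assumption for illustration: the class and method names (AccessLogKeys, keysFor, count), the Combined Log Format regex, the naive browser sniffing and the list of "page" extensions. The geo IP lookup is left out since it needs an external database, so the raw IP is emitted instead of a country.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: turns one access log line into tagged keys.
class AccessLogKeys {

    // Assumes Combined Log Format:
    // ip ident user [time] "METHOD resource PROTOCOL" status bytes "referer" "user-agent"
    private static final Pattern LINE = Pattern.compile(
        "^(\\S+) \\S+ \\S+ \\[[^\\]]+\\] \"\\S+ (\\S+)[^\"]*\" \\d{3} \\S+ \"[^\"]*\" \"([^\"]*)\"$");

    // What a mapper would emit for one line; unparseable lines yield nothing.
    static List<String> keysFor(String line) {
        List<String> keys = new ArrayList<>();
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return keys;
        }
        keys.add("ip:" + m.group(1)); // a geo IP lookup would map this to a country
        keys.add("browser:" + browserOf(m.group(3)));
        String path = stripQuery(m.group(2));
        if (isPage(path)) {
            keys.add("page:" + path);
        }
        return keys;
    }

    // Very naive user agent sniffing, good enough for a sketch only.
    static String browserOf(String userAgent) {
        for (String name : Arrays.asList("Firefox", "Opera", "MSIE", "Safari")) {
            if (userAgent.contains(name)) {
                return name;
            }
        }
        return "Other";
    }

    static String stripQuery(String resource) {
        int q = resource.indexOf('?');
        return q >= 0 ? resource.substring(0, q) : resource;
    }

    // Treats directory URLs and a few extensions as pages; images, CSS etc. are dropped.
    static boolean isPage(String path) {
        return path.endsWith("/") || path.endsWith(".html")
            || path.endsWith(".htm") || path.endsWith(".jsp");
    }

    // What the reducer would do: sum the occurrences of each tagged key.
    static Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> totals = new TreeMap<>();
        for (String line : lines) {
            for (String key : keysFor(line)) {
                totals.merge(key, 1, Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "10.0.0.1 - - [16/Jun/2008:10:00:00 +0000] \"GET /index.html HTTP/1.1\" 200 2326 \"-\" \"Mozilla/5.0 (X11; U; Linux i686) Gecko/2008061015 Firefox/3.0\"",
            "10.0.0.2 - - [16/Jun/2008:10:00:01 +0000] \"GET /logo.png HTTP/1.1\" 200 512 \"-\" \"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)\"");
        // Print each tagged key with its total, tab separated.
        count(lines).forEach((key, n) -> System.out.println(key + "\t" + n));
    }
}
```

The prefix before the colon then decides which report a total belongs to (requests by IP or country, browser share, page popularity), so the counting stays a single pass rather than a pipeline of three jobs. Whether that beats a JobControl pipeline on real data is something this sketch doesn’t answer.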

Will post an update once I get through the Java route.

Posted in Java | No Comments »

Scalability principles: Lessons from eBay

Posted by Raghu on June 5, 2008

InfoQ: Scalability Best Practices: Lessons from eBay

Great article on eBay’s scalability principles.

Posted in Java | No Comments »

Maven EMMA plugin - filters don’t work

Posted by Raghu on May 24, 2007

Using v0.5 of the Maven EMMA plugin. Found that if you specify an include/exclude filter, no classes are instrumented… On digging a little deeper under the covers, it seems that there’s a bug in the plugin - I’ve edited the plugin’s Jelly script and am using the fixed version here.

Posted the bug and the fix to Maven’s SourceForge developer forum. You can find it here.

The fixed plugin.jelly is here: Maven emma plugin 0.5 fixed

Posted in Java, Tips, Tools | No Comments »

Ant - Debugging classpath

Posted by Raghu on April 30, 2007

Here’s a nice tip on debugging classpaths in Ant: http://www.javalobby.org/java/forums/t71033.html

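Essentially, flatten the classpath into a property with pathconvert and echo it before compiling:

<?xml version="1.0"?>
<project name="project" default="default">
    <property name="lib" value="web/WEB-INF/lib"/>
    <property name="src" value="src"/>
    <property name="dist" value="dist"/>

    <!-- Every jar under the lib directory goes on the compile classpath -->
    <path id="classpath">
        <fileset dir="${lib}">
            <include name="**/*.jar"/>
        </fileset>
    </path>

    <target name="default" description="--> description">
        <!-- The debugging step (a sketch; the property name is arbitrary):
             one classpath entry per line, echoed before javac runs -->
        <pathconvert property="classpath.debug" refid="classpath" pathsep="${line.separator}"/>
        <echo message="compile classpath:${line.separator}${classpath.debug}"/>

        <!-- javac fails if destdir does not exist -->
        <mkdir dir="${dist}"/>
        <javac srcdir="${src}" destdir="${dist}">
            <classpath refid="classpath"/>
        </javac>
    </target>

</project>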

Posted in Java, Tips, Tools | No Comments »

Cobertura vs EMMA

Posted by Raghu on April 21, 2007

This is a rant against the Cobertura Maven integration - and to a certain extent against Cobertura itself - and hopefully an objective one ;). Hopefully this’ll help someone choose between the two big (open source) coverage tools out there.

We’ve been using Cobertura at work till now and it’s done its job nicely - the reports look great and the Maven 1.x integration, while not neat, is functional. We knew that someday we were going to have to merge coverage data and have a single report across multiple kinds of testing (JUnit, Selenium and manual), but the Cobertura documentation stated that this was possible, so we weren’t really bothered.

Thought I’d give it a whirl and set it up - and that’s when the trouble started. At least, with the Maven integration.

First of all - there’s no way to just instrument code. You can generate the report (which will instrument classes and run the tests), but if you just want to instrument classes so that the final deployable contains instrumented classes, it’s a no-go with the Maven plugin.

Obviously, no point giving up there - so thought I’d just include the Ant tasks and go the Ant way in my Maven goal. Turns out that there’s no ‘plugin init’ kind of goal that can be called after build:start to set up dependencies and import the Cobertura Ant tasks. You have to do it all yourself. Fine - went that way too - so now my maven.xml uses the Cobertura Ant tasks and finally I’m able to generate an instrumented build. YIPPPPPPPEEEEEE…. or wait… let’s just make sure that this thing works…

Does it?

Turns out - no - it doesn’t. At first it looks promising: I dropped the WAR into Tomcat, accessed the login page of the application and then shut Tomcat down nicely. There’s even a cobertura.ser created in the Tomcat bin folder, and I’m thinking that probably this will all work together finally…

So I go ahead and tweak my maven.xml further with a coverage task that will merge the data from the JUnit runs and the servlet container runs. Turn the switch on… and lo and behold… an exception reading the merged data file. Back to Google, and after hunting around for some time I found this: Bug while merging reports

So finally I’m ready to give up on Cobertura and give EMMA a try… and it couldn’t have been better…

1. Goals are nicely set up

2. You can init the EMMA system with the emma:init goal and then use Ant tasks if you want flexibility for doing things like merging reports.

3. The merging works :))

One sticky issue that I did come up against was that, for the same source and test cases, the coverage reported by Cobertura and EMMA differs widely. With Cobertura we were at 40% coverage, while with EMMA the number’s up to 60%. EMMA has some literature on how it does things, but I’d be glad if someone could explain why or how the reported numbers could be so different for the same code base and unit test suite.

Posted in Agile, Java, Tips, Tools | No Comments »