Tuesday, March 18, 2014

Stamping Text on Existing Pdf Files

I recently created a framework for stamping text onto existing PDF files using the iText library. It is available here on GitHub.

The workflow goes something like this:

  1. Create a PdfForm
  2. Add your PdfFormStamps (location, alignment, and rotation are defined in the PdfFormPlaceholder)
  3. Run it through the PdfFormServiceImpl.stampPdf
  4. Then you can save the bytes wherever you like
Seeing the results is kind of clunky, since it's a bit of a guessing game where your stamps will show up, but it's better than nothing, I suppose.

Friday, March 7, 2014

Top 5 Ways to Make Better Software

I don't think I've ever met anyone involved in software development that didn't care about making a great product.  Of course there are varying degrees of caring, but I think for the most part most people are pretty passionate about creating something worth putting their name on.

Unfortunately, for a lot of software dev shops the product that's released doesn't turn out to be as good as the developers had hoped. It could be that the devs are inexperienced, or maybe they had to cut some corners to meet the deadline. There could be any number of reasons, but this article is meant to help us understand the top 5 ways to make better software.

  1. Test Everything (that matters)
  2. Peer Code Review
  3. Adopt Agile Principles
  4. Teach Each Other
  5. Have Fun
Before I get into the top 5 I should probably also give some context.  I've found that these 5 things are surprisingly absent in the organizations that would reap the biggest benefits: large enterprises.

Test Everything (that matters)

I'm always surprised when I get into a complex system only to find there are no tests. Of course, the usual objection pops up: "Writing tests takes too long, so why would I want to write them?"

Reason 1: Regression tests.  The tricky thing with large and complex systems is that they can be really awkward to maintain.  The simplest change can land you into a tailspin of unexpected changes and, if things are really out of control, they can lead to new bugs being introduced.  So how can we have confidence we haven't broken anything else with the change we just made?  Hopefully there is a test suite that we can run to make sure we didn't inadvertently break anything.  Now, of course if simple changes cause huge headaches then there is probably something wrong with the software design, which leads us to reason 2.

Reason 2: As a design tool. Test-driven development (TDD) has been taking the industry by storm over the last several years, and the reason is more than just making sure we have a test suite. TDD is an excellent design tool that helps us avoid many of the common patterns that lead to a bad design, i.e. hard-to-maintain code.

Automate the tests. Of course, even if we are writing good tests we may accidentally break something as we're adding a new feature. To get the full benefit of our tests we want to make sure they run automatically. For this reason we have continuous integration (CI) servers like Hudson or Jenkins. They should run the tests at least once per day and send out alerts if a test failure is found. Another nice feature of CI servers is that they tie directly into your SCM, like SVN or git, so they can see which changes to the code base caused a test to fail.

Testing what matters. We could literally spend all day writing tests for all the code we could create, so what is worth testing? Business logic is probably the only thing worth worrying about. All those getters and setters probably aren't useful to test, unless there is some business logic in there. Basically, the rule I use is that if the code wasn't automatically generated then it's probably worth testing.

Peer Code Review

The peer code review is a dead simple tool that has a dramatic effect on code quality. In its simplest form it's just having someone else look at the code you wrote and give you feedback. Asking someone to look at your code doesn't mean you're a bad developer.

Catch bugs before they go to QA. The most obvious benefit of code reviews is catching bugs before they leave development. No matter how experienced a developer is, sometimes our eyes begin to play tricks on us and we might type the wrong thing, or maybe we understood the problem incorrectly. At any rate, code reviews help catch bugs while they're still fairly cheap to fix.

Share knowledge. Another great benefit of code reviews is the knowledge sharing that happens. It isn't uncommon to find new ways of doing something while performing a code review. As a developer I've learned a ton from looking at code written by others.

Pre-commit. Generally there are two different ways of doing a code review. Both are useful, but for different reasons. A pre-commit code review takes place before you commit your code to the SCM. In essence, you call someone over and walk them through what you have done. This allows for a quick two-way dialogue and is probably the most effective way to do a code review. The only drawback is that you need to find someone who is available when you're ready to commit.

Post-commit. In contrast, the post-commit review is done after you have committed your code, and it usually goes through some kind of code review tool like Crucible or Gerrit. The reviewer can then look over the code when it's convenient. As you can imagine, the communication isn't nearly as effective as a face-to-face conversation would be.

Adopt Agile Principles

Agile is another one of those terms that has taken the industry by storm and it has come to mean a lot of different things to different people.  In a nutshell Agile is a philosophy that helps us focus on delivering the most value as soon as possible. Just like anything it takes practice to get it right, but the dividends are awesome. 

Agile values and principles.  Agile was born as just a bunch of ideas that were grouped together in the Agile Manifesto and also the Agile principles.  They're pretty easy to understand, but it might not be as easy to see how to apply them.  This is where an agile framework would be useful.

Scrum.  Scrum has become the most popular agile framework used in the industry today.  It builds off of the Agile values and principles and gives us some simple rules, roles, and meetings to follow that help us deliver working software as often as possible. A couple of paragraphs aren't nearly enough to adequately explain Agile and Scrum, but there are a ton of resources available.

Continuous Improvement. The biggest benefit your team will get from adopting Agile is the idea of continuous improvement.  This principle helps us to always keep an eye to the future to continually work better and better. 

Teach Each Other

Along the lines of always improving comes the idea of learning.  Learning is an absolute necessity for any developer to stay marketable.  I've also found that the best way to learn, and retain, an idea or skill is to teach it.  

Good for business, good for people. Traditional organizations follow the mantra that only the leaders and managers have ideas worth listening to and everyone else is there only to carry out those ideas. An organization like that will generally find that its employees aren't very engaged and that it's difficult to attract new talent. All in all, it's a pretty crappy place to work. On the flip side, when an organization values its employees and their ideas, people will want to work there and will feel a great desire to give their best effort.

Culture of learning.   Part of learning is making mistakes, but the problem is our culture teaches that making mistakes is bad and because we think mistakes are bad we're afraid to make any.  This fear of making mistakes smothers our desire to innovate and experiment.  In this way the culture inhibits learning.  But, the best way I've found to overcome this fear is to create a classroom type setting where it's okay to make mistakes.  Little workshops and trainings can go a long way to help and eventually will spill over into the main day-to-day operations.

Have Fun

Creating software requires a lot more creative thought than most people think.  It can also be very stressful since the nature of software makes it very hard to predict when it will be finished.  Having fun is a great tool for battling the stresses that accompany software, but more importantly it increases our quality of life.

Laughter encouraged. A little while ago I watched a video with famous comedian John Cleese about the effects of laughter. Even if it's forced, laughter has been shown to relax us and make us healthier. It also has a strange way of spreading to others. When we hear others laugh we start laughing too, even if we didn't hear the joke.

Change of scenery. Sometimes the easiest way to have fun is to switch things up a little. Take time to talk with others, or take a walk when you find your brain reaching its breaking point. Games and jokes are also good ways to bring out smiles. We spend most of our waking time at work, so we should make it a place we enjoy being.

Creative Juices.  Many people don't realize how much creativity is required for software development.  That might be because we often refer to building software as engineering which has a lot of connotations of calculations and mathematics.  For a lot of software that isn't the case.  It's more like growing a garden than building a bridge.  At any rate, fun and humor help to get our creative juices flowing so that we can keep growing our software garden into something pretty awesome.

So these are my top 5 ways to improve software that would have the biggest impact for most companies.  I'm positive there are a million other ways that would have a dramatic effect on improving software and I'd like to hear them all so please share any good ideas in the comments.  

References and Further Reading:

Testing Links:

Peer Code Review Links:

Agile Links:

Teach Links
Transforming Your Manufacturing Organization 
How Corporate Learning Drives Competitive Advantage:

Fun Links:
How to Play and Boost Creativity: http://www.helpguide.org/life/creative_play_fun_games.htm
Benefits of Laughter w/ John Cleese: https://www.youtube.com/watch?v=yXEfjVnYkqM

Monday, February 10, 2014

Spring MVC 3: Property referenced in indexed property path is neither an array nor a List nor a Map

jQuery's $.ajax does an excellent job of mapping a JSON object to parameters, but once you get into more complex objects, Spring MVC sometimes doesn't know how to interpret the result.

For example if you have a json object like:

jQuery will map your parameters like


The problem is that Spring MVC is expecting a parameter format like


You can get that format in one of two ways. The quick and dirty way is to change how you build your JSON object. The other way is to extend the jQuery plugin to build parameters differently.

Changing the JavaScript code was pretty simple and looked something like this:

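// Build each key in the dotted, indexed form Spring MVC expects, e.g. beans[0].agreementId.
// index, agreementId, and value are assumed to be defined by the surrounding code.
var answers = {};
answers['beans[' + index + '].agreementId'] = agreementId;
answers['beans[' + index + '].answerId'] = value;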

To modify the jQuery plugin I would suggest taking a look here.

And for reference, here are the POJOs I was mapping to.
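As a rough sketch only (the class names AnswerForm and AnswerBean and the Long field types are assumptions, not the original classes), a form-backing object for parameters such as beans[0].agreementId would need a List property named beans whose elements expose agreementId and answerId:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical element type; the property names follow the keys built in the JavaScript.
// In a real project each class would normally be public and live in its own file.
class AnswerBean {
    private Long agreementId;
    private Long answerId;

    public Long getAgreementId() { return agreementId; }
    public void setAgreementId(Long agreementId) { this.agreementId = agreementId; }

    public Long getAnswerId() { return answerId; }
    public void setAnswerId(Long answerId) { this.answerId = answerId; }
}

// Hypothetical form-backing object; "beans" matches the prefix used in the parameter names.
class AnswerForm {
    // Initialized up front so the list is never null when elements are added.
    private List<AnswerBean> beans = new ArrayList<>();

    public List<AnswerBean> getBeans() { return beans; }
    public void setBeans(List<AnswerBean> beans) { this.beans = beans; }
}
```

With a shape like this, the dotted name beans[0].agreementId reads as "element 0 of the beans list, property agreementId", which is the kind of path the error message is complaining about when the brackets are nested instead.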

Friday, January 3, 2014

Split a Java List Into a List of Sublists

Oftentimes I'm updating a large number of records with a batch SQL script, and a lot of the time I run into this error: DataIntegrityViolationException: Prepared or callable statement has more than 2000 parameter.

To keep each update within the allowed number of parameters, I've created a method that splits a List of objects into sublists based on the max number of records you want. I then pass each sublist to the batch update method.
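A minimal sketch of such a method might look like the following. It is an illustration of the idea, not the original code: the class name ListSplitter and the method name split are made up, and it uses only java.util.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named here only for illustration.
final class ListSplitter {

    private ListSplitter() {
        // Static utility, not meant to be instantiated.
    }

    /**
     * Splits source into consecutive sublists holding at most maxSize
     * elements each. The last sublist may be shorter than the others.
     */
    static <T> List<List<T>> split(List<T> source, int maxSize) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < source.size(); start += maxSize) {
            int end = Math.min(start + maxSize, source.size());
            // subList returns a view, so copy it to keep each chunk independent.
            chunks.add(new ArrayList<>(source.subList(start, end)));
        }
        return chunks;
    }
}
```

Calling ListSplitter.split(records, 500) and looping over the result would hand each sublist to the batch update in turn. Keep in mind that if every record binds several parameters, the sublist size has to be the parameter limit divided by the parameters per record, not the limit itself.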

Wednesday, December 18, 2013

How to Not Run Integration Tests With the Eclipse JUnit Plugin

Here is the scenario. A Maven multi-module project being developed in Eclipse, the m2e plugin keeping the different modules in sync in Eclipse, integration and unit tests mixed throughout each project, and integration tests using the naming convention *IntegrationTest.java.

Here is the problem. Using Maven to run unit tests for the entire project kind of sucks. Sure, it's easy to limit the tests being run with the Maven Surefire plugin's exclude, but there are two problems: 1) it takes too long to build the project before the tests are run, and 2) if you have altered an upstream project, Eclipse won't show any compile problems because m2e keeps the modules in sync, but the Maven build doesn't seem to know about the change, so it fails to build.

Half of the solution

The first half of the solution is the JUnit plugin that comes with Eclipse. You can right-click on a project and Run As a JUnit Test, or ctrl+alt+shift+T, and it will run through all the tests really fast.
This is only half of the solution because it also runs through the integration tests, which generally take a long time.

To avoid running the integration tests you can create a test suite. The test suite is a pretty good way to limit which tests are run, but out of the box JUnit 4 requires you to either annotate each test with a @Category annotation and then use @IncludeCategory or @ExcludeCategory on the test suite, or, in the less friendly version, use the @SuiteClasses annotation on the test suite and list out each test class individually. When you have hundreds of test classes this is a lot of work.

The other half of the solution

What we really need is a way to use wildcards to limit the tests we want to run, similar to the Surefire plugin's exclude. The solution I've come across is something called the JUnit Toolbox. The cool thing about the project is that it has a couple of custom runners that allow you to specify wildcards for the classes to run. What that means is that we can write a test suite like this:

   import org.junit.runner.RunWith;
   import com.googlecode.junittoolbox.ParallelSuite;
   import com.googlecode.junittoolbox.SuiteClasses;

   // Run every *Test class except those ending in IntegrationTest.
   @RunWith(ParallelSuite.class)
   @SuiteClasses({"**/*Test.class", "!**/*IntegrationTest.class"})
   public class AllUnitTestsTestSuite {}

and it will only run the tests in that project whose class names don't end in IntegrationTest.

Now the only downside I've come across is that if you have classes that match the naming convention but don't have tests in them, you will get an initialization error, though you can use categories to exclude those. Also, if you're looking for something to run tests for all of your projects, this doesn't seem to address that either. All in all though, I've gone from a fairly manual process that takes 20-30 seconds with Maven to a pretty automatic one that takes about 9 seconds.

Tuesday, December 3, 2013

Spring @Scheduled method found on bean target class

I received this error while wiring up a method with the @Scheduled annotation on it. The method was a concrete implementation of an interface method, so at runtime the bean was a proxy. Apparently, the @Scheduled annotation has a hard time being on a proxy, but the workaround seems to be to put the annotation on both the interface method and the concrete implementation.

There is a bug logged here.
And more information on @Scheduled.

Wednesday, November 27, 2013

Book Review: Planning for Big Data by Edd Dumbill; O'Reilly

Planning for Big Data gives a 50,000-foot view into the world of big data and data science. It is perfect for the big data newbie, and the author even says that if you're already working with big data you should give the book to a friend. I found it to be an easy read that didn't take longer than 3 or 4 hours.
Get it for free
The author skims over the main areas of big data and provides what is almost a list of relevant technologies and services. After reading this book you get a small glimpse of how large the big data world is.
The author begins with why organizations are interested in big data and in analyzing their data as quickly as possible. Then we're introduced to some of the technologies that make it possible, such as Hadoop.
Cloud computing is a big player in big data, and the author gives great comparisons of the main cloud providers, specifically Amazon, Google, and Microsoft. It was very useful to have a side-by-side comparison of the services and the features they offer.
The author also lists many of the NoSQL solutions available and explains why a NoSQL solution can be beneficial for big data. He emphasizes an agile approach, how NoSQL allows developers to easily build on an existing structure, and how it helps to scale horizontally.
The last main piece is the importance of visualizations that are interactive and that can communicate complex ideas. As the author puts it, these are different from the normal boring bar graphs and charts put into slide presentations.
In summary, I felt Planning for Big Data was a great starting point for someone who's beginning to explore the world of big data, and it might even be useful for someone who is a little more familiar with big data as a reference to other tools and services that are available.