Just watched demos for Vaadin and Play to get first impressions.
In both cases, I liked what I saw, but of course I need to learn a lot more about those frameworks
to see when and for which kind of projects I would choose either Vaadin or Play over Spring MVC,
which I have been using professionally for a while now.
It's obvious, however, that Spring MVC and Play share a similar, web-oriented notion of the MVC pattern, while I am not sure whether MVC plays much of a role in Vaadin at all, with its focus
on
RIA and one-page web applications.
(In Vaadin, I guess MVC would be applied in a way similar to desktop apps.)
Both at work and on this site, I use TWiki as my wiki engine of
choice. TWiki has managed to attract a fair share of plugin and add-on writers,
resulting in wonderful tools like an add-on which integrates KinoSearch,
a Perl library on top of the Lucene search engine.
This month, I installed the add-on at work. It turns out that in its current state,
it does not support Office 2007 document types yet, such as
.docx
,
.pptx
and
.xlsx
,
i.e. the so-called "Office OpenXML" formats. That's a pity, of course, since
these days, most new Office documents tend to be provided in those formats.
The KinoSearch add-on doesn't try to parse (non-trivial) documents
on its own, but rather relies on external helper programs which extract
indexable text from documents. So the task at hand is to write such
a text extractor.
Fortunately, the
Apache POI project just released
a version of their libraries which now also support OpenXML formats, and
with those libraries, it's a piece of cake to build a simple text extractor!
Here's the trivial Java driver code:
package de.clausbrod.openxmlextractor;
import java.io.File;
import org.apache.poi.POITextExtractor;
import org.apache.poi.extractor.ExtractorFactory;
public class Main {
public static String extractOneFile(File f) throws Exception {
POITextExtractor extractor = ExtractorFactory.createExtractor(f);
String extracted = extractor.getText();
return extracted;
}
public static void main(String[] args) throws Exception {
if (args.length <= 0) {
System.err.println("ERROR: No filename specified.");
return;
}
for (String filename : args) {
File f = new File(filename);
System.out.println(extractOneFile(f));
}
}
}
Full Java 1.6 binaries are
attached;
Apache POI license details apply.
Copy the ZIP archive to your TWiki server and unzip it in a directory of your choice.
With this tool in place, all we need to do is provide a
stringifier plugin to
the add-on. This is done by adding a file called
OpenXML.pm
to the
lib/TWiki/Contrib/SearchEngineKinoSearchAddOn/StringifierPlugins
directory in the TWiki server installation:
# For licensing info read LICENSE file in the TWiki root.
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details, published at
# http://www.gnu.org/copyleft/gpl.html
package TWiki::Contrib::SearchEngineKinoSearchAddOn::StringifyPlugins::OpenXML;
use base 'TWiki::Contrib::SearchEngineKinoSearchAddOn::StringifyBase';
use File::Temp qw/tmpnam/;
__PACKAGE__->register_handler(
"application/vnd.openxmlformats-officedocument.spreadsheetml.sheet", ".xlsx");
__PACKAGE__->register_handler(
"application/vnd.openxmlformats-officedocument.wordprocessingml.document", ".docx");
__PACKAGE__->register_handler(
"application/vnd.openxmlformats-officedocument.presentationml.presentation", ".pptx");
sub stringForFile {
my ($self, $file) = @_;
my $tmp_file = tmpnam();
my $text;
my $cmd =
"java -jar /www/twiki/local/bin/openxmlextractor/openxmlextractor.jar '$file' > $tmp_file";
if (0 == system($cmd)) {
$text = TWiki::Contrib::SearchEngineKinoSearchAddOn::Stringifier->stringFor($tmp_file);
}
unlink($tmp_file);
return $text; # undef signals failure to caller
}
1;
This script assumes that the
openxmlextractor.jar
helper is located at
/www/twiki/local/bin/openxmlextractor
; you'll have to tweak this path to
reflect your local settings.
I haven't figured out yet how to correctly deal with encodings in the stringifier
code, so non-ASCII characters might not work as expected.
To verify local installation, change into
/www/twiki/kinosearch/bin
(this is
where my TWiki installation is, YMMV) and run the extractor on a test file:
./ks_test stringify foobla.docx
And in a final step, enable index generation for Office documents by adding
.docx
,
.pptx
and
.xlsx
to the Main.TWikiPreferences topic:
* KinoSearch settings
* Set KINOSEARCHINDEXEXTENSIONS = .pdf, .xml, .html, .doc, .xls, .ppt, .docx, .pptx, .xlsx
At this year's Java forum in Stuttgart,
I was one of 1100 geeks who divulged in
Suebian Brezeln and presentations on all things Java.
After a
presentation on Scala,
I passed by a couple of flipcharts which were set aside for birds-of-a-feather (BoF)
sessions. On a whim, I grabbed a free flipchart and scribbled one word:
Clojure. In the official program, there was no presentation
covering Clojure, but I thought it'd be nice to meet a few people who, like me,
are interested in learning this new language and its concepts!
Since I had suggested the topic, I became the designated moderator for this session. It
turned out that most attendees didn't really know all that much about Clojure or Lisp - and so
I gravitated, a bit unwillingly at first, into presentation mode. Boy, was I glad that right
before the session, I had refreshed the little Clojure-fu I have by reading an
article or two.
In fact, some of the folks who showed up had assumed the session was on clo
sures
(the programming concept)
rather than Clo
jure, the language
But the remaining few of us still had a spirited
discussion, covering topics such as dynamic versus static typing, various Clojure language
elements, Clojure's Lisp heritage, programmimg for concurrency, web frameworks, Ruby on Rails,
and OO databases.
To those who stopped by, thanks a lot for this discussion and for your interest.
And to the developer from Bremen whose name I forgot (sorry): As we suspected, there is
indeed an alternative syntax for creating Java objects in Clojure.
(.show (new javax.swing.JFrame)) ;; probably more readable for Java programmers
(.show (javax.swing.JFrame.)) ;; Clojure shorthand