I've been evaluating Pentaho CE and thought these notes might be useful for anyone involved in a similar project. It looks like a great product, but it's a little difficult to find all the information in one place, so I made some notes as I went along.
OK, first you'll need the Turnkey "Core" Ubuntu appliance ISO.
I downloaded Sun's VirtualBox to host some of these Turnkey appliances. It makes it very easy to start and stop virtual systems whilst you are configuring/evaluating a particular appliance (although I've only tried this on WinXP (cough!) so far; I must do the same on a full Ubuntu server install). Using VirtualBox or VMware or whatever, create a new virtual machine, call it "Pentaho" for example, and accept the defaults for an Ubuntu Linux guest. Once VirtualBox has created the base virtual machine, but before starting it up, make sure the network is set to bridged, so that the virtual machine gets a DHCP ip-address from the local network when it boots. Next go to the CD/DVD-ROM settings and point them at the "Core" Turnkey appliance ISO you downloaded earlier.
Now start your virtual machine and, after a few seconds, it should start booting into the appliance. After a further few seconds you'll get a menu screen with some options on it; choose "install to hard-disk" and then just accept all the defaults. You'll need to set a root password. Eventually the system will be installed and will need to reboot. Just before you do, unmount the CD/DVD from the "Devices" menu, then reboot. It will boot into your new virtual server and give you some new menu options about network configuration etc. Make a note of your ip-address here, you're going to need it later.
Quit out of the menus until you're back at the "login" prompt, then log in as "root" with the password you created when you installed the core system.
If necessary, set the relevant keyboard layout (this is needed to type the piping and redirection characters at the console on non-US keyboards etc.).
(Some users might also need to reconfigure the language too: "dpkg-reconfigure locales".)
INSTALLING JAVA6 RUNTIME
Note: I'd tried using the "Tomcat" Turnkey appliance prior to this and some of Pentaho did work, but I was getting port conflicts between the Tomcat 5.5 that comes with the Pentaho installation and the one in the appliance, so I decided to go with a "Core" appliance installation and install just what was necessary for Pentaho to run.
You're going to need to download some files from the repositories. If you're sat behind a proxy you might need to add the following lines to your "bash.bashrc" file in /etc/;
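The proxy lines themselves haven't survived in these notes, so the following is only a sketch of what such lines typically look like. The user name, password, address and port are made-up placeholders, and it assumes the same proxy handles both HTTP and FTP requests:

```shell
# Hypothetical proxy settings for the end of /etc/bash.bashrc.
# "user", "pass", 192.168.1.1 and 8080 are placeholders, not real values.
export http_proxy="http://user:pass@192.168.1.1:8080/"
# Reuse the same proxy for ftp:// repository URLs.
export ftp_proxy="$http_proxy"
```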
You'll need to change the user/pass and proxy server ip address appropriate for you obviously.
(You'll need to log out for these to take effect, so just type "exit" and then log in again. You could type "bash" to rerun the shell, but then you'd be running a shell within a shell and your outer shell won't have these settings, so it's easier just to exit and log in again.)
The sun-java runtime is in the multiverse, so you're going to need to change the Turnkey list of repositories to include "multiverse" before you can download the JRE. To do this, take out the # from the multiverse lines in the two main repositories (in the Core appliance I downloaded, the first block of repositories points to the Turnkey repositories and the second and third blocks point to the standard Ubuntu ones; just take out the # on the multiverse lines in the 2nd and 3rd blocks). These are held in the file /etc/apt/sources.list.d/sources.list,
which you'll need to edit with vi or vim. Once this is done, type "apt-get update";
this will update the list of packages that are available to download, which should now include the "multiverse" ones (I didn't check whether there is a Turnkey multiverse repository, which might be more appropriate). Follow this with;
apt-get install sun-java6-jre
(if you get 403 Forbidden errors, try re-editing the /etc/apt/sources.list.d/sources.list and changing all the http's to ftp and then rerun "apt-get install sun-java6-jre" - this worked for me with the proxy I was using)
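If you'd rather not edit the file by hand, the multiverse change can be scripted. This is only a sketch: enable_multiverse is a helper name I've made up, and it assumes the disabled lines look like "# deb ... multiverse". It uncomments every such line in the file, so check your own file first and look at the result afterwards.

```shell
# enable_multiverse FILE
# Uncomments every "deb"/"deb-src" line in FILE that mentions multiverse.
# Hypothetical helper; a copy of the original is kept as FILE.bak.
enable_multiverse() {
    cp "$1" "$1.bak" || return 1
    sed -E '/multiverse/ s/^[[:space:]]*#+[[:space:]]*(deb(-src)?[[:space:]])/\1/' \
        "$1.bak" > "$1"
}

# On the appliance this would be used roughly as:
#   enable_multiverse /etc/apt/sources.list.d/sources.list
#   apt-get update
#   apt-get install sun-java6-jre
```

Ordinary comment lines that merely mention multiverse are left alone, because the pattern only strips a # that sits directly in front of "deb" or "deb-src".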
While Java is installing you'll see a licence prompt on the console, which you'll need to accept for it to actually install. Eventually your prompt will return. Now we need to set up the environment so Pentaho knows where Java is. First list the contents of /usr/bin;
you should see a file called java. If it's there, type "java -version"
and hopefully you'll see the version of Java listed as 1.6 and some copyright stuff from Sun. Now go back into the bash shell configuration file (/etc/bash.bashrc) and add a line at the end of the file (use vi/vim etc.) that exports JAVA_HOME, set to the folder above the "bin" folder that holds java, i.e. /usr.
Once you're back at the prompt, type "exit" to log out, then log back in and echo the JAVA_HOME variable;
hopefully you'll see the value you just set.
If not, retrace your steps and check the relevant files.
(Note: I initially set JAVA_HOME to /usr/bin, as that's what I thought it should be based on some searches on Google, but Pentaho then looks for Java in /usr/bin/bin/java, which obviously isn't correct.)
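The exact commands for these checks are missing from the notes, so here's a rough sketch of the same idea. check_java_home is a made-up helper, and it assumes (as the /usr/bin/bin/java mix-up suggests) that Pentaho runs whatever it finds at $JAVA_HOME/bin/java:

```shell
# check_java_home DIR
# Succeeds if DIR/bin/java exists and is executable, which is the path
# Pentaho is assumed to build from JAVA_HOME. Hypothetical helper.
check_java_home() {
    if [ -x "$1/bin/java" ]; then
        echo "ok: $1/bin/java"
    else
        echo "missing: $1/bin/java"
        return 1
    fi
}

# With java installed as /usr/bin/java, /usr should pass and /usr/bin
# should fail:
#   check_java_home /usr
#   check_java_home /usr/bin
# The matching line for the end of /etc/bash.bashrc would then be:
#   export JAVA_HOME=/usr
```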
Now download the version of Pentaho you want to install; I downloaded "biserver-ce-3.0.0-STABLE.tar.gz" to my local PC. You then need to get this across to your virtual server, and in this case I used Webmin, which is part of the Core appliance setup. Open a browser and go to the Webmin URL on the ip-address you noted earlier;
you'll probably get errors about security certificates. Accept/override these until it gives you the Webmin web-based login dialog box.
Log in as root with the password you set when you first installed the "Core" Linux appliance. From the menu choose "Tools"->"Upload and Download", then switch to the Upload tab. Using one of the browse buttons, find the Pentaho package you downloaded and click on the upload button; a progress dialog will appear showing the file going across to your virtual server. Webmin will let you know when it has completed.
Now go back to your bash console on the virtual server. Create a folder off the root folder (i.e. /) called "pentaho" (i.e. "mkdir /pentaho") and move the file you just transferred via Webmin to it (note: this is just a personal preference; if you need to remove stuff later it's always easier if it's boxed off in a separate folder somewhere, rather than trying to traverse a scattered file system to remove multiple files/folders).
In your new folder ("cd /pentaho"), type "tar -xvf biserver-ce-3.0.0-STABLE.tar.gz"; a whole raft of folders will scroll past as the compressed files are extracted. At the end of this, if you type "ls -al" you should see the compressed file together with two new folders called "biserver-ce" and "administration-console". In each of these two folders there should be some scripts for starting and stopping the relevant service. Go into the biserver-ce one (i.e. "cd biserver-ce") and do an "ls".
You'll see the "start-pentaho.sh" script. At the prompt type "./start-pentaho.sh"; you'll see a few messages, but after a few seconds the server will (hopefully :-)) have started.
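The move-and-unpack steps can be sketched as a small function. unpack_pentaho is a name I've invented, /pentaho is just the folder chosen above, and the -z flag simply tells tar explicitly that the archive is gzip-compressed:

```shell
# unpack_pentaho ARCHIVE [DEST]
# Moves ARCHIVE into DEST (default /pentaho) unless it is already there,
# then extracts it in place. Hypothetical helper.
unpack_pentaho() {
    archive="$1"
    dest="${2:-/pentaho}"
    mkdir -p "$dest" || return 1
    case "$archive" in
        "$dest"/*) ;;                              # already in DEST
        *) mv "$archive" "$dest/" || return 1 ;;
    esac
    ( cd "$dest" && tar -xzf "$(basename "$archive")" )
}

# e.g. unpack_pentaho /path/to/biserver-ce-3.0.0-STABLE.tar.gz
# (the upload location depends on what you chose in Webmin)
```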
Run up a browser on your PC and go to "http://<server-ipaddress>:8080/pentaho" (8080 being the port the bundled Tomcat listens on); with any luck you'll be presented with the "Pentaho" user login window.
Similarly, back in the virtual machine console, just press return to get back to a prompt and then do a "cd ../admin*" to go into the "/pentaho/administration-console" folder. At the prompt, type "./start.sh" and the admin console should start up.
Now open another browser window/tab on your PC and navigate to "http://<server-ipaddress>:8099/", you should get the Admin console window and a request for the username and password login parameters. Type "admin" and "password" and then login, you should then have the option to create and manage the users using the administration sections.
You should also be able to see that the Pentaho service is running by the "Server Status" icon above the central frame (it will have a red-X on the icon if the service is not running).
To exit from the administration-console press Ctrl-C; I get the impression it should only be running when you need it. There also doesn't seem to be any need to run the shutdown script ("./stop.sh") to stop it (as there would be with Pentaho itself), as Ctrl-C seems to stop the service. EDIT: I had a process go awry at one point and had to switch to a different console (ALT-F2) and run ./stop.sh, which seemed to work eventually, so perhaps it is there for good reason.
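Since both services are started and stopped from their own folders, a tiny wrapper can save some typing. This is only a sketch: pentaho_ctl and PENTAHO_BASE are names I've made up, and it assumes a "stop-pentaho.sh" script sits next to "start-pentaho.sh" in biserver-ce (check with "ls" first).

```shell
# Base folder holding biserver-ce and administration-console.
PENTAHO_BASE="${PENTAHO_BASE:-/pentaho}"

# pentaho_ctl {start|stop|admin-start|admin-stop}
# Runs the matching script from inside its own folder. Hypothetical helper.
pentaho_ctl() {
    case "$1" in
        start)       ( cd "$PENTAHO_BASE/biserver-ce" && ./start-pentaho.sh ) ;;
        stop)        ( cd "$PENTAHO_BASE/biserver-ce" && ./stop-pentaho.sh ) ;;
        admin-start) ( cd "$PENTAHO_BASE/administration-console" && ./start.sh ) ;;
        admin-stop)  ( cd "$PENTAHO_BASE/administration-console" && ./stop.sh ) ;;
        *) echo "usage: pentaho_ctl {start|stop|admin-start|admin-stop}" >&2
           return 2 ;;
    esac
}
```

Each script is run in a subshell, so your own prompt stays in whatever folder you were in.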
That should at least have got you started.
Now shut down the services and navigate to "/pentaho/biserver-ce/tomcat/webapps/pentaho/WEB-INF". Here vi/vim the "web.xml" file and change the relevant context entry from "localhost:8080" to the address of the server, and then restart the server. Although the sample reports were sort of working without this change, I noticed some of the charts were looking for the images that had been created for them on "localhost" (i.e. they were looking on the local PC and not on the server); hopefully this fixes that. Note: I had a little trouble here for a minute or so. I'm not sure what I'd done, but Pentaho wouldn't let me log in as a user I'd created using the administration-console web utility (it kept giving me authentication errors). Eventually, after stopping and restarting the service a couple of times, it let me in and the reports seemed to start working.
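The web.xml change can also be done with sed rather than by hand. A sketch only: set_pentaho_host is a made-up helper, 192.168.1.50 is a placeholder address, and it assumes you want to keep port 8080. It blindly swaps every "localhost:8080" in the file, so look at the result before restarting.

```shell
# set_pentaho_host FILE HOST
# Replaces every "localhost:8080" in FILE with "HOST:8080",
# keeping the original as FILE.bak. Hypothetical helper.
set_pentaho_host() {
    cp "$1" "$1.bak" || return 1
    sed "s/localhost:8080/$2:8080/g" "$1.bak" > "$1"
}

# e.g.
#   cd /pentaho/biserver-ce/tomcat/webapps/pentaho/WEB-INF
#   set_pentaho_host web.xml 192.168.1.50
```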
One side issue of setting this ip-address is that to run the google api demo you'll need to register the ip address with google - there's a link in pentaho to do this directly although you will need to edit a file within the pentaho configuration to get this to work from there on in.
MS SQL Server access - I downloaded the jTDS SQL Server driver from SourceForge (just google it for the relevant URL if you need this). Once you've got the file, extract "jtds-1.2.2.jar" and put it in both the "/pentaho/biserver-ce/tomcat/common/lib" and "/pentaho/administration-console/jdbc/" folders, which is where the other JDBC drivers appear to be held. Then stop and start the services (at least the console anyway) and hopefully the driver will appear in the drivers option on "Add Data Source" in the administration console. Look at the website documentation (it's also in the driver's compressed file) to get the right format for the URL. Note: I couldn't get this to work with the localised EXPRESS version of SQL Server, but it seemed to work fine with the server versions. I tried the Microsoft driver too (available from their website) but it just gave errors when I tried to connect. If anyone knows how to configure these properly I'd be grateful.
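Copying the driver into both folders can be scripted as below. install_jdbc_driver is an invented name and the folder layout is the one described above. For what it's worth, the jTDS documentation gives the URL format as jdbc:jtds:sqlserver://<host>:<port>/<database> with the driver class net.sourceforge.jtds.jdbc.Driver; <host>, <port> and <database> are yours to fill in.

```shell
# install_jdbc_driver JAR [BASE]
# Copies a JDBC driver jar into the two folders where the other drivers
# live, under BASE (default /pentaho). Hypothetical helper.
install_jdbc_driver() {
    jar="$1"
    base="${2:-/pentaho}"
    cp "$jar" "$base/biserver-ce/tomcat/common/lib/" || return 1
    cp "$jar" "$base/administration-console/jdbc/"
}

# e.g. install_jdbc_driver /path/to/driver.jar
# then stop and start the services so the driver is picked up.
```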
All in all it seems a good BI suite, and if you're comparing a commercial Reports/BI suite with what there is in the open source arena then it's definitely worth a look. Even if you decide to go with a commercial option for support reasons, there's always the commercial version of Pentaho (and it comes with documentation too!).