Solr search, Drupal and CentOS

by ekes

I keep meaning to write shorter blog posts, and for once this one is easy to keep short. Solr is a powerful search server, and there has been some great work on a Solr module that integrates really cleanly with Drupal. I've kept putting off trying it out because of the expected pain of having to get the correct Java version and then fight with Tomcat to get it working properly, etc. But to keep it short...

It works!

It works out of the box with the openjdk-1.6.0 rpms and the packaged Jetty container. Follow the README to get the PHP library and swap over the configs in the example as described. Run java -jar start.jar and you're in business! The docs even say the packaged Jetty is good enough for a single-instance (smaller) production site.
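For the record, the Jetty route can be sketched roughly like this. It is only a sketch: it assumes the module ships its own schema.xml and solrconfig.xml (check the README for your version), and the function names and every path passed in are made up for illustration.

```shell
#!/bin/sh
# Sketch only: every path passed in is a placeholder for your own layout.

# Back up the stock example configs and copy in the module's versions.
swap_solr_configs() {
  module_dir=$1   # assumed: where the Drupal Solr module is unpacked
  solr_conf=$2    # e.g. apache-solr-nightly/example/solr/conf
  for f in schema.xml solrconfig.xml; do
    if [ -f "$solr_conf/$f" ]; then
      mv "$solr_conf/$f" "$solr_conf/$f.orig"
    fi
    cp "$module_dir/$f" "$solr_conf/$f" || return 1
  done
}

# Start the packaged Jetty from the example directory (runs in the foreground).
start_solr_jetty() {
  example_dir=$1  # e.g. apache-solr-nightly/example
  ( cd "$example_dir" && java -jar start.jar )
}
```

Call swap_solr_configs with the module directory and the example conf directory first, then start_solr_jetty with the example directory. Keeping the .orig copies makes it easy to go back to the stock configs.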

But it works with Tomcat too! Not quite out of the box for me, but using the tomcat5 rpm and following the "Configuring Solr Home with JNDI" section of the SolrTomcat page, it was just a matter of:

  1. copying apache-solr-nightly/example/solr/ to /my/solr/home
  2. copying apache-solr-nightly/dist/apache-solr-nightly.war to /some/path/solr.war
  3. making sure both are somewhere Tomcat can get to (it's running as user tomcat)
  4. swapping over the configs, just as in the README for the Jetty version
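The steps above can be sketched as a couple of shell functions. This is a rough sketch, not a definitive recipe: the function names are made up, the paths are the same placeholders as in the list, and the context fragment follows the JNDI example on the SolrTomcat page as I understand it, so check it against the page for your Solr version.

```shell
#!/bin/sh
# Sketch only: function names are illustrative, paths are placeholders.

# Steps 1 and 2: copy the example Solr home and the war out of the nightly.
deploy_solr() {
  nightly_dir=$1  # unpacked apache-solr-nightly directory
  solr_home=$2    # e.g. /my/solr/home
  war_path=$3     # e.g. /some/path/solr.war
  mkdir -p "$solr_home" "$(dirname "$war_path")" &&
    cp -R "$nightly_dir/example/solr/." "$solr_home/" &&
    cp "$nightly_dir/dist/apache-solr-nightly.war" "$war_path"
}

# Write the context fragment that tells Tomcat where the war and the
# Solr home live (the JNDI "solr/home" environment entry).
write_solr_context() {
  war_path=$1
  solr_home=$2
  context_file=$3  # assumed location: conf/Catalina/localhost/solr.xml
  cat > "$context_file" <<EOF
<Context docBase="$war_path" debug="0" crossContext="true">
  <Environment name="solr/home" type="java.lang.String" value="$solr_home" override="true"/>
</Context>
EOF
}
```

For step 3, something like chown -R tomcat on the Solr home afterwards should do it, since Tomcat runs as user tomcat. Where the context file belongs depends on how the rpm lays out Tomcat's conf directory, so treat that path as a guess.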

Plus the not-quite-out-of-the-box bit. It was giving me this error:

  SEVERE: Exception starting filter SolrRequestFilter
  Caused by: java.lang.RuntimeException: XPathFactory#newInstance() failed to create an XPathFactory for the default object model: http://java.sun.com/jaxp/xpath/dom with the XPathFactoryConfigurationException: javax.xml.xpath.XPathFactoryConfigurationException: No XPathFactory implementation found for the object model: http://java.sun.com/jaxp/xpath/dom

The XML libraries it needs are installed as dependencies (xalan-j2 and xerces-j2), but clearly not on a path that gets searched. So for now I just symlinked them from their location (/usr/share/java/) into the /usr/share/tomcat5/webapps/solr/WEB-INF/lib directory. But even with Tomcat from the rpms, that was it!
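The symlinking amounts to something like the sketch below. The jar file name patterns are an assumption based on the package names, so list /usr/share/java/ first to see what the rpms actually installed, and the function name is just for illustration.

```shell
#!/bin/sh
# Sketch only: jar name patterns are guessed from the rpm package names.

# Symlink the xalan-j2 and xerces-j2 jars into a webapp's lib directory.
# Pass absolute paths, otherwise the links will not resolve.
link_xml_jars() {
  src_dir=$1  # e.g. /usr/share/java
  lib_dir=$2  # e.g. /usr/share/tomcat5/webapps/solr/WEB-INF/lib
  for jar in "$src_dir"/xalan-j2*.jar "$src_dir"/xerces-j2*.jar; do
    [ -e "$jar" ] || continue  # pattern matched nothing, skip it
    ln -sf "$jar" "$lib_dir/"
  done
}
```

Run it as a user who can write to the lib directory, then restart Tomcat so the webapp picks up the jars.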

Brilliant!

Comments

Hi, I want to ask something,

Hi, I want to ask something, perhaps it's off topic. I just want to know, have you installed Drupal on Red Hat? Is there any difference from installing it on CentOS?

Erskine, Vps Hosting website

Removing the 'search/node' menu item

It is a little confusing having 'search/node' displayed as the 'content' tab next to the 'search' tab from the apachesolr_search module. Search is the default menu item, so in hook_menu_alter you can set $items['search']['page arguments'] = array('apachesolr_search'); to point it at apachesolr_search. All the other searches are added from hook_search by search_menu as $items['search/' . $the_name_of_the_search . '/%menu_tail'], so $items['search/node/%menu_tail']['type'] = MENU_CALLBACK; at least hides 'search/node' (it's still there; have a look on this site :)

Two more slightly more thorough ways of removing core search

Slightly better path

Now I've popped the symlinks into /usr/share/tomcat5/shared/lib instead, but I'm sure there is a better way.

I made a symlink to

I made a symlink to /usr/share/tomcat5/shared/lib but it doesn't work for me :S
I'll have to google more :S

Hi Ekes, First off, thanks

Hi Ekes,

First off, thanks for the post, it was exactly what I was looking for. I do have a question though. I have a CentOS 5 server on a VPS. I got openjdk up and running (used yum). However, I keep getting a "Could not reserve enough space for object heap" error when I try to run the start.jar file. (I tested on an Ubuntu 9.04 dev machine and it runs fine.) I have plenty of memory in the VPS system, but I think I have tracked the problem down to the fact that the RAM is burstable and Java is seeing the entire system memory (not just the guaranteed amount, but the full burstable amount and possibly more). I have read posts saying that Sun's version tries to reserve too much memory, which is what causes the issue described above on VPS machines. I am assuming openjdk is doing the same. I am not a Java guy, so I honestly don't know anything about the inner workings of the VM or the Solr app.
My question to you (if you know Java) is: am I better off trying to get Tomcat up and running because of the VPS issue? In other words, have you noticed whether Tomcat handles the memory reservation differently than openjdk?

thanks,

Sorry no expert on this but

Sorry, I'm no expert on this, but Tomcat is going to be a bit bigger, and it's still using Java, though it is more configurable. But can you not just set the min and max heap to fit when executing java for the Jetty container?

-Xms set initial Java heap size
-Xmx set maximum Java heap size

But you are going to need a bit of memory all of the time to run anything like this.
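As a minimal sketch of passing those two flags when starting the packaged Jetty: the function name is made up, and the sizes you pass are entirely up to what your VPS guarantees, so the 64m and 256m mentioned below are only illustrative values, not recommendations.

```shell
#!/bin/sh
# Sketch only: run from the Solr example directory that holds start.jar.

# Start Jetty with an explicit initial and maximum heap size.
start_solr_with_heap() {
  min_heap=$1  # initial heap, e.g. 64m
  max_heap=$2  # maximum heap, e.g. 256m
  java -Xms"$min_heap" -Xmx"$max_heap" -jar start.jar
}
```

Calling start_solr_with_heap 64m 256m runs java -Xms64m -Xmx256m -jar start.jar. Start low and raise the maximum until the "Could not reserve enough space" error comes back, then step down again.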