I am a Permanent Member (or Community Ambassador) of the Drupal Association.
Drupal is in love with Solr, as can be seen by the absolutely great session proposals that have been submitted for DrupalCon Copenhagen in August. If you want to see a healthy dose of search goodness happening in Denmark, here are the links to go vote.
Drupal's file handling capabilities keep getting better. Beyond the core upload module, the filefield module for CCK has enabled us to build sites with all sorts of files: documents, images, music, videos, and so forth. Searching within these documents, however, has never been a common feature on Drupal sites. Some solutions have existed, particularly for extracting text from PDFs and common word processing documents. With Apache Solr, the attachments module, and an extension library called Tika, things can be much better. With Tika you can extract text not only from Microsoft Office, Open Office, and PDF documents; you can also get text and metadata from images, songs, Flash movies and zipped archives. Searching this text is done as part of the normal Apache Solr driven site search.
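For a feel of what Tika does on its own, outside Drupal, here is a minimal Python sketch that shells out to the Tika command-line app. It assumes you have a tika-app jar on disk and Java on your PATH. The --text and --metadata options belong to the Tika app; the function names and file paths are placeholders made up for illustration, and this is not necessarily how the attachments module invokes Tika.

```python
import subprocess


def tika_command(tika_jar, document, metadata=False):
    """Build the argument list for running the Tika app on one file."""
    # --text prints the extracted plain text, --metadata prints only metadata.
    mode = "--metadata" if metadata else "--text"
    return ["java", "-jar", str(tika_jar), mode, str(document)]


def extract_text(tika_jar, document):
    """Run Tika and return the extracted text (needs Java and the jar)."""
    result = subprocess.run(
        tika_command(tika_jar, document),
        capture_output=True,
        text=True,
        check=True,  # raise if Tika exits with an error
    )
    return result.stdout


# Usage (placeholder paths, point these at your own jar and file):
#   text = extract_text("tika-app.jar", "report.pdf")
```

The extracted text is what would end up in the search index alongside the node's own fields.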
For the last six months, Scott Reynolds has been keeping a big juicy secret. As the maintainer of the Apache Solr Views module, he knows just how cool the future of Drupal Search is going to be. His module, based on an idea and code from Thomas Seidl, lets you make custom searches against the Solr index the same way you currently make views against the MySQL database. Want to build a search that just includes videos and MP3s, and renders the results as a playlist? Or how about a search that is limited to the current user's images, displayed in a slideshow? How about a block that shows the latest results that contain the phrase "badgers are the new pony"? Well, even if you didn't want a block like that, with Views 3 and Apache Solr Views, you can have it.
Thomas Seidl's brilliant idea was that Views should be able to build "queries" against any data source, not just databases. Earl Miles agreed, and inaugurated the Views 3 branch by committing the patch by Thomas (with great help from Jeff Miccolis and others). With Views 3 I predict you'll be able to build Views using data from Flickr, or from RDF databases using SPARQL, or from the local file system, or from any other data source that has an API.
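The idea is easier to see in miniature. The sketch below is Python, not Drupal code, and none of the class or method names come from Views 3. It only illustrates how one "view" definition can be handed to interchangeable backends, each of which builds the query its own data source understands. The table and field names are made up.

```python
from dataclasses import dataclass, field


@dataclass
class ViewDefinition:
    """What the site builder chose in the UI, independent of any backend."""
    fields: list
    filters: dict = field(default_factory=dict)
    sort: str = ""


class SqlBackend:
    """Turns a view into a parameterized SQL query against one table."""

    def __init__(self, table):
        self.table = table

    def build(self, view):
        sql = f"SELECT {', '.join(view.fields)} FROM {self.table}"
        if view.filters:
            sql += " WHERE " + " AND ".join(f"{k} = ?" for k in view.filters)
        if view.sort:
            sql += f" ORDER BY {view.sort}"
        return sql, list(view.filters.values())


class SolrBackend:
    """Turns the same view into Solr request parameters."""

    def build(self, view):
        params = {"q": "*:*", "fl": ",".join(view.fields)}
        if view.filters:
            params["fq"] = [f"{k}:{v}" for k, v in view.filters.items()]
        if view.sort:
            params["sort"] = f"{view.sort} asc"
        return params


view = ViewDefinition(fields=["title", "type"], filters={"type": "story"}, sort="title")
sql_query = SqlBackend("node").build(view)
solr_query = SolrBackend().build(view)
```

The view itself never mentions SQL or Solr; swapping the backend is the only change needed to point it at a different data source.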
To test it all out I used the Acquia Drupal Stack to create a new site (I just love the stack's multisite functionality!). I then signed up for a trial Acquia Network subscription because I wanted to get my hands on 30 days of free Acquia Search (it's easier than setting up Solr myself). I then downloaded Views 3 and Apache Solr (DRUPAL-6--2, just for fun. DRUPAL-6--1 works, too). I had to get the Apache Solr Views module from CVS (Scott, make a devel release!). I put these in sites/all/modules so that they'd override the versions in the Acquia Drupal Stack.
The CVS command for getting Apache Solr Views
$ cvs -d:pserver:anonymous:anonymous@cvs.drupal.org:/cvs/drupal-contrib \
    co -d apachesolr_views contributions/modules/apachesolr_views
I installed Apache Solr manually, which means I also needed to get the SolrPhpClient library. Since I have Drush, and since Apache Solr DRUPAL-6--2 has Drush integration, I did it like this:
$ drush solr phpclient
I <3 Drush!
I then used FeedAPI to grab all sorts of content from Planet Drupal. I could have just as well used Drush and the Devel module to generate some content, but lorem ipsum gets mighty boring. Finally I used Drush to run cron and even did a search (from the command line!) to check that the content was in the index.
$ drush cron
$ # wait a few minutes for the search index to commit the changes...
$ drush solr search drupal
node/175 by admin (user/1) title: Agile and Scrum Videos This is likely to become a pretty big collection of videos about Scrum and other other Agile based managements processes. (Drupal 5, Drupal 6, Drupal 7, Drupal Planet, Drupal Video) ...
node/1 by admin (user/1) title: Welcome to your new Acquia Drupal website! If you are new to Drupal, follow these steps to set up your web site in minutes: Step 1 ... , forums, polls, tags, comments, ratings, and more. Acquia Drupal comes with many modules to power social publishing capabilities on your site. Hundreds of additional Drupal 6.x compatible modules ...
Now for the good stuff. When you make a new view in Views 3 you get asked what data source to use. Here you can see that I use the Apache Solr search index as a data source.
Then I added some fields. These are not the same fields that are available to node based views. They are specific to the underlying data source.
I also added a sort so that the results would be displayed according to the search score (keyword relevance).
In order to make this view seem like a "search" screen, it needs a search box, right? You get that by adding a search filter and exposing it. I could add more filters, too, like a filter to limit it to just one content type.
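Under the hood, choices like these (a set of fields, a sort on score, an exposed keyword filter and an optional content type filter) boil down to a handful of Solr request parameters. Here is a rough Python sketch of that mapping. The q, fl, fq and sort parameters are standard Solr; the host, the field names and the function name are assumptions made for illustration, and this is not the code Apache Solr Views actually runs.

```python
from urllib.parse import urlencode


def build_select_url(keywords, fields, content_type=None,
                     base_url="http://localhost:8983/solr"):
    """Build a Solr /select URL for a keyword search sorted by relevance."""
    params = [
        ("q", keywords),            # what the user typed in the search box
        ("fl", ",".join(fields)),   # the fields to return for each result
        ("sort", "score desc"),     # most relevant results first
    ]
    if content_type is not None:
        # Optional filter that limits results to one content type.
        params.append(("fq", f"type:{content_type}"))
    return f"{base_url}/select?{urlencode(params)}"


url = build_select_url("drupal video", ["title", "name", "created"],
                       content_type="story")
print(url)
```

Using fq for the content type rather than folding it into q keeps the filter from influencing the relevance score.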
This shouldn't just be a copy of the normal search screen. The results should look different. To that end I told Views to render the results in a table.
Since we want this to be a page view it needs a path, and I went ahead and stuck it in the menu as well.
Finally, I want to be able to use Apache Solr's facet blocks along with the view. This is a three-step process.
It tastes great! Feast your eyes on this marvelous search screen.
The keyword search and the facet block interact seamlessly.
An interesting point to note is that there are no database queries used in retrieving the data or displaying it. No complex views query with lots of joins, and no node_load() calls for displaying the results. This method of querying Solr is just as efficient as using the normal Apache Solr search module.
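To make that concrete: everything a results table needs can come straight out of Solr's response, so rendering is just a loop over the returned documents. The Python sketch below uses a hand-written response in the shape Solr returns for wt=json. The field names and scores are invented sample data, and the function names are hypothetical.

```python
import json

# Hand-written sample in the shape of a Solr JSON response (values invented).
SAMPLE_RESPONSE = """
{
  "responseHeader": {"status": 0, "QTime": 3},
  "response": {
    "numFound": 2,
    "start": 0,
    "docs": [
      {"title": "Agile and Scrum Videos", "name": "admin", "score": 1.7},
      {"title": "Welcome to your new Acquia Drupal website!", "name": "admin", "score": 0.9}
    ]
  }
}
"""


def rows_from_response(raw_json, columns):
    """Turn Solr documents into table rows using only returned field values."""
    docs = json.loads(raw_json)["response"]["docs"]
    return [[str(doc.get(col, "")) for col in columns] for doc in docs]


def render_table(columns, rows):
    """Render a header line and rows as tab-separated text."""
    lines = ["\t".join(columns)]
    lines += ["\t".join(row) for row in rows]
    return "\n".join(lines)


columns = ["title", "name", "score"]
rows = rows_from_response(SAMPLE_RESPONSE, columns)
print(render_table(columns, rows))
```

Nothing here touches a database: if a field is stored in the index and requested, it is available for display.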
To my mind, Views 3 and Apache Solr Views are the future of Solr search for Drupal. Even though they are both in heavy development, you can try them out and enjoy the great control you have over your search experience. There are many more handlers that need writing, too, so jump into the Apache Solr Views issue queue and help out. Since it all works with Acquia Drupal and Acquia Search, you can easily get up and running using an Acquia subscription. Enjoy!
The place to get nightly builds of the Solr project is http://people.apache.org/builds/lucene/solr/nightly/
It has been down for maintenance for over 24 hours, which turned out to be very bad timing for me, as I had just deleted my not-too-recent copy with the intent of testing the latest and greatest. This left me without any Solr available. Not to worry, building from source is painfully easy thanks to the awesomeness of Ant. Here’s how:
# Get the source code from the subversion repository.
svn co http://svn.apache.org/repos/asf/lucene/solr/trunk solrnightly
cd solrnightly
# This will launch the build process
ant example
# After the build finishes you can start the solr server.
# Move into the example directory...
cd example
# Now is the time to copy your solrconfig.xml and schema.xml
# into ./solr/conf
# And then start things up.
java -jar start.jar
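Once the server is up, a quick way to confirm Solr is answering is to request its ping handler. This is a small Python sketch. It assumes the example server's default port 8983 and an /admin/ping handler defined in your solrconfig.xml, so adjust both if your setup differs; the function names are made up.

```python
from urllib.error import URLError
from urllib.request import urlopen


def ping_url(host="localhost", port=8983, path="/solr/admin/ping"):
    """Build the URL of the ping handler (the defaults are assumptions)."""
    return f"http://{host}:{port}{path}"


def solr_is_up(url, timeout=5):
    """Return True if the URL answers with HTTP 200, False on any failure."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


# Usage, after `java -jar start.jar` has finished starting:
#   print(solr_is_up(ping_url()))
```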
In this article I will show you how you can write a tiny bit of code that will reveal new fields and facets for searching with the ApacheSolr module and Acquia Search. Using Acquia Drupal we’ll write an example module that takes the file type from CCK file and image fields and turns it into its own search field. This lets us filter our search results by file type. The code covers the case where you want, for example, to find a specific post that has a JPEG image, or all of the posts with PDFs that match a particular keyword.
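Stripped of Drupal specifics, the gist is: look at each file attached to a node, take its MIME type, and add it to the search document as an extra field that Solr can filter and facet on. Here is that logic as a short Python sketch. It is not the module's code (which would be PHP); the field name sm_filetype, the dictionary shapes, the sample files and the function names are all hypothetical.

```python
def add_file_types(document, attached_files, field_name="sm_filetype"):
    """Add the distinct MIME types of a node's files to its search document."""
    mime_types = sorted(
        {f["filemime"] for f in attached_files if f.get("filemime")}
    )
    if mime_types:
        document[field_name] = mime_types  # multi-valued field
    return document


def file_type_filter(mime_type, field_name="sm_filetype"):
    """Build a Solr filter query that keeps only documents with this type."""
    return f'{field_name}:"{mime_type}"'


# Invented sample node with two images and one PDF attached.
doc = add_file_types(
    {"nid": 42, "title": "Holiday photos"},
    [
        {"filename": "beach.jpg", "filemime": "image/jpeg"},
        {"filename": "sunset.jpg", "filemime": "image/jpeg"},
        {"filename": "itinerary.pdf", "filemime": "application/pdf"},
    ],
)
print(doc["sm_filetype"])
print(file_type_filter("image/jpeg"))
```

Storing each MIME type once per document keeps the facet counts meaningful: a post with ten JPEGs still counts as one post with a JPEG.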
Today I released Beta 2 of the ApacheSolr module, the module responsible for search on Robshouse.net. There are some bugfixes, some performance improvements, and a usability improvement, but most of all there is a new feature. With this release it is possible to sort search results. If you search on this site you will notice a block inviting you to Sort by, and your options are Title, Author, Type and Date. Author doesn't make much sense on this site since there is only one, but I think the other sorts will be useful to people who want to find things like the first post I wrote here about Drupal.
I'd also like to encourage people who are using the ApacheSolr module to send me links to their search pages and I'll start compiling a showcase.
I’ve relaunched RobsHouse.net on Drupal 5, yay! It was about time, too. For years it was running on a 4.6 Drupal (Civicspace, actually), and I had done nothing but break it more and more over the years. Thank goodness nobody could see the watchdog messages… PHP errors galore and just about everything else wrong as well. This site, however, is a different beast. While it is not comparable in complexity to some of the large media sites that are being launched on Drupal these days, it’s a pretty good effort for one guy over a long weekend. I’m using 54 custom modules on this site! Click here to see the full list. More details after the break.